Unintended consequences


After the introduction of a clumsily worded new rule, the UK government should move quickly 
to reassure scientists that they can continue to advise policymakers. 


02 March 2016 


Only a fool ignores well informed advice. And only a very foolish government demands not to receive it 
in the first place. 


But that is what the British government is in danger of doing. Last month the Cabinet Office — the
ministry that supports the Prime Minister in running the government — introduced a new condition
attached to government grants.


A new rule warns that money from any grant, either issued direct from departments or through third
parties, cannot be used to “support activity intended to influence or attempt to influence Parliament,
government or political parties ... or attempting to influence legislative or regulatory action”.


Despite increasing concern from academics, the Department for Business, Innovation and Skills,
responsible for billions of pounds of research funding, could not say as
Nature went to press whether the rule would apply to science grants and university funding. The 
research councils and the Higher Education Funding Council for England — which actually parcel up the 
money for institutions and academics — are equally in the dark. All of which leaves scientists fearing that 
they are about to be muzzled. 


This situation is not as shocking as in Canada, where a previous government deliberately set out to
gag its researchers. Instead, the UK case seems to be more cock-up than conspiracy. No official seems
to have thought through what it might mean to stop anyone who receives government money saying
anything of substance to government. The Cabinet Office says that the clause was introduced to stop
bodies that rely on government funds from lobbying government for more funding.


What could be lost if this clause is implemented fully is unclear. The specifics of how it will work have not 
been set out in great detail. But it could cover some of what government-funded scientists already do. A 
group of cross-party politicians in the House of Lords, for example, is conducting an inquiry into what 
impact the result of the United Kingdom’s pending referendum, on whether to stay in or quit the 
European Union, would have on science. Among those giving evidence to the Lords are seven research 
institutions that are either government-owned or receive substantial government grants, including the
Met Office weather agency, animal-health centre the Pirbright Institute in Surrey and the plant experts at 
the John Innes Centre in Norwich. Then there are nine universities or university centres, plus individual 
professors. 


All the evidence culled from this wealth of expertise could be jeopardized by a heavy-handed 
implementation of this clause — for what is the point of evidence that has no influence? Even if these 
groups did still give evidence, some of it would have to be watered down or heavily qualified. Academic 
input has enlightened discussions of climate change, pollinator declines, biomedical ethics and many 
other issues of crucial importance to the future of the United Kingdom and the wider world. 


The clause does not even limit itself to activity that tries to influence the United Kingdom; it merely 
forbids “attempting to influence legislative or regulatory action”. Will some of the world’s leading climate 
scientists be prevented from contributing to the Intergovernmental Panel on Climate Change’s Summary 
for Policymakers because they are dependent on government money? Should British wildlife experts not 
give policy advice to foreign nations attempting to save biodiversity? Surely not, but this is one possible 
reading of the clause. 


Scientists in the United Kingdom could be forgiven for feeling baffled by the development, given that the 
government-funded research councils have spent recent years promoting the ‘impact agenda’. This 
encourages scientists to make sure that their work has reach outside their own academic disciplines, 
including influencing policy and legislation. 


Officials have indicated that the problem can be fixed. Ministers have the power to remove the rule 
entirely from grants, or add in ‘qualifications’ that could permit some limited additional uses for the 
money. 


All researchers supported by government — regardless of what organizational auspice they operate 
under — should be in no doubt that they have not only a right but a duty to speak out about the 
implications of their work. There must be a complete exemption of any research from this clause, not 
just for those who work in academia, but for those who work directly for government. 


Nature 531, 7 (03 March 2016) doi:10.1038/531007a 




Related stories and links


From nature.com
• Confusion reigns as UK scientists face government ‘gagging’ clause
26 February 2016
• Canadian election brings hope for science
20 October 2015
• Communication breakdown
01 April 2015
Comment from Robert Ward, 2016-03-02 03:56 PM:
A petition has been launched to persuade the UK Government to exempt university researchers from
the new ‘anti-lobbying’ clause. You can sign it here:
https://petition.parliament.uk/petitions/122957


Future present


A young global-sustainability platform deserves time to find its feet. 
01 March 2016 


In pure science, as in art, little is urgent. Gravitational waves were discovered — a triumph for
curiosity-driven science — thanks to physicists’ patience and imaginative power. That they had waited decades is
irrelevant. Alas, not all science has the luxury of timelessness. 


Urgent science touches on issues that rank high on the social agenda. Theorists have classified fields
such as climatology and global-change research as post-normal science, in which socio-economic stakes
are high and decisions are pressing. That is the case with the agenda of Future Earth, an international
sustainability-research platform set up in 2012 to tackle complex social challenges, from climate
change to finance.


The scheme replaces a number of narrower programmes, including 
the international Geosphere-Biosphere and Human Dimensions programmes, which — to the regret of 
many — are now all closed. 


Sustainability researchers will need to follow a multidisciplinary — nay, transdisciplinary — approach that
goes beyond what many scientists have been used to. Future Earth’s ‘co-design’ intends natural and 
social scientists to plan and carry out research with outside experts. Whether that will win over academic 
researchers, stakeholders and, crucially, funders remains to be seen. To convince sceptics, the scheme 
needs to provide a successful example of how it will work in practice. 


Preservation of the natural commons, such as atmosphere, water, land and oceans, for future 
generations is vital and a cause to which any responsible scientist will happily subscribe. But combining 
the conventional scientific methods of the natural and social sciences with knowledge from various other 
sources — land owners and planners, insurance companies, conservation groups, emergency 
organizations and political decision-makers — poses conceptual and organizational challenges. 


A cross-community Future Earth workshop on adaptation and responses to extreme climate events, 
held last month in Berlin, offered a taste of such challenges (see go.nature.com/6utfmi) and might serve 
as a test for the design of research networks on sustainability issues. Under time pressure, participants 
had to draft a research strategy to address the drivers and implications of extreme events, and make it 
fit with Future Earth’s conceptual framework — a tough issue. The workshop asked scientists from 
different academic cultures to do that work, which produced semantic confusion and the odd unhelpful 
generalization. 


But the workshop was not in vain. Many participants (a healthy number of whom were from developing
nations) said that they revelled in being pushed out of their comfort zones. They produced several 
meaty research questions, including some genuinely new ideas for how the social and natural sciences 
could interact. For example, when do climatic and socio-economic factors combine to amplify the 
impacts of climate extremes and induce cascading harm? Are there ‘tipping points’ at which social or 
natural systems might fail to recover from shocks? And how might science-based adaptation work in 
data-scarce regions? 


The ideas found an audience. Representatives of funding agencies at the workshop cautiously indicated 
that the proposals stand a good chance of getting funded by the Belmont Forum, a worldwide group of 
21 major funders of global environmental-change research. 


But governments and grant-giving agencies have not yet firmly committed to funding Future Earth as a 
whole. The reluctance comes from uncertainty over what the scheme might be able to deliver. The 
closure of successful programmes in favour of something fashionable but conceptually unproven has 
earned Future Earth sceptical glances. But then, it was launched in response to complaints that 
previous programmes were not sufficiently linked and that the knowledge they produced was scarcely 
picked up in practice. There is no lack of studies, for example, on how extreme heat, rain and wind affect 
farmers, city dwellers and coastlines in many parts of the world. But the results are almost useless if 
they never make it beyond the pages of academic journals. 


Future Earth will need to make sure that scientific evidence comes to the desks of decision-makers, no 
matter what they might then make of it. But the programme should also avoid retreading familiar ground. 
The mountain of data from previous programmes, including countless climate-change studies, remains 
relevant — even if the information hasn’t yet been put to constructive use. 


Future sustainability research, no matter how interdisciplinary, should build on that heritage and focus 
on finding and closing knowledge gaps. In doing so, scientists involved in Future Earth can provide an 
invaluable service to society. And researchers in niche disciplines — palaeoclimatology or behavioural 
science, say — who work to fill those gaps will get a welcome chance to put their work into a broader 
context. 


Future Earth might also become a showcase for linking natural and social sciences — a real necessity 
given that human activity is altering the planet at worrying speed. But sustainability research must not 
become tied in the straitjacket of conceptualism and utilitarianism. Scientists are not merely service 
providers. As in any other field of science, sustainability research must remain at its core a
curiosity-driven affair.


Nature 531, 7-8 (03 March 2016) doi:10.1038/531007b


Related stories and links


From nature.com
• Adaptation trade-offs
23 October 2015
• Finite Earth
21 September 2015
• The political economy of climate adaptation
24 June 2015
• Coastal conundrums
28 January 2015
• Climate panel says prepare for weird weather
18 November 2011
• Climate and weather: Extreme measures
07 September 2011


From elsewhere
• Future Earth
• UN Sustainable Development Goals
• Extreme Events and Environments — from climate to society
• International Geosphere Biosphere Programme (closed)
• International Human Dimensions programme (closed)


Brain power 


As brain stimulation finds non-medical uses, 
now is the time to consider its implications. 


It is a cautionary tale for twenty-first-century medicine. Late last year, neurosurgeons in the United
States reported odd symptoms in three of their patients. Well into their sixties and seventies, these
people complained of headaches, nausea, unstable balance, weak legs, low blood pressure and falling
down. Chest X-rays revealed the problem. Two devices implanted into their chests — a pacemaker to help
their failing hearts and a battery unit that powered electrodes buried deep inside their heads to
control the signature tremors of Parkinson's disease — had been placed too close together. One machine
was interfering with the functioning of the other (M. Sharma et al. Basal Ganglia 6, 19-22; 2016).

From iron lungs and dialysis machines to implantable defibrillators, we are used to technology helping
our bodies. Deep-brain stimulation — the electrodes and battery implanted in the patients’ heads — has
been helping people with neurological and psychiatric disorders for more than a decade, but it requires
quite a commitment. Brain surgery is expensive and not for everybody. The number of people who might
benefit is very small given the overall burden posed by mental illness and related problems. This is
one reason why there is a lot of interest in cheaper and easier types of brain stimulation, which apply
electric current and magnetic fields to the outside of the head.

If these types of brain stimulation are found not to produce much of a difference, then it will not
have been for a lack of effort. Academic journals are filling up with case reports and preliminary
trials of the technologies to help people with depression, autism spectrum disorders, schizophrenia,
obsessive-compulsive disorder, addiction, anxiety and many more cognitive problems. It is early days,
but there are enough positive results to draw the attention of people who struggle with such issues or
know someone who does. Some of these people want to try it on themselves or their children. An
electrical brain stimulator is fairly simple to make, and even simpler to buy from one of the companies
that are popping up to sell them online. Self-medication has never been so high-tech.

Many neuroscientists have raised the alarm over do-it-yourself (DIY) brain stimulation, pointing out
that it can be unsafe in the short term and might have side effects in the long term. Some want
regulation. But there is another, more fundamental, ethical issue that must be confronted. As we report
in an Outlook article on page S6 — one of a series of pieces that discuss cognitive enhancement — the
use of DIY brain stimulators is not confined to those for whom conventional medicine has failed. A
small but growing number of people want to use the devices to improve their natural mental abilities.
And in so doing, this community is piggy-backing on scientific studies that suggest that electric
currents and magnetic fields could improve academic performance by boosting memory and attention, and
perhaps even alter attitudes.

The use of medicines to enhance performance in sport is frowned on, and a clear line has been drawn
between taking them to treat and taking them to cheat. Could a similar distinction be made for
cognitive-enhancement techniques? Should it be? It’s too soon to answer some of these questions —
scientists and doctors must first reach consensus on the effectiveness of the techniques — but it is
not too soon to ask them.


WORLD VIEW


Stop needless dispute of science in the courts


Primers on various scientific topics could be used across trials to avoid wasting time on debating
basic points, argues David Neuberger.


Testimony from expert witnesses — and I have heard a lot in my career as a judge — is a long-standing
and important feature of legal proceedings. The scientists, engineers, inventors and technologists who
offer their opinions in court are encouraged to agree on basic points before a trial begins. But they
often do not agree as much as we hope. That tends to lengthen the time taken to cross-examine them, and
contributes to justice being an expensive, drawn-out and stressful experience for all involved.

Better would be for courts to have a set of scientifically agreed principles that lay out the
consensus opinion on some topics, and where there is reasonable doubt on others. Judges typically get
such a primer when trying a patent dispute. Both sides allow their expert witnesses jointly to present
points on which they agree, and which will not be disputed. This effectively sets a baseline for the
ensuing arguments, which can still diverge significantly. These primers are useful, but only for
specific cases. When the case they were prepared for is settled, the primer normally becomes redundant.

It’s not realistic for primers to be prepared individually for every case, but perhaps they could be
created for topics that recur — and that are argued about each time. Scientific issues arise in a
substantial number of cases, certainly enough to justify a primer that could be applied to many of
them. These could be broken down into four themes. For example, forensic-science primers could detail
how crime-scene samples can be matched to DNA profiles and how mixed profiles can be disentangled.
Pure-science primers could explain how computer memories can be accessed and interpreted, and
good-practice primers could lay out appropriate medical treatments and techniques. Scientific-method
primers could set out the use and reliability of statistics.

From my experience, such primers would be hugely beneficial. They should set out the current generally
accepted facts and opinion in each area, and be written as far as possible in accessible language (as
all evidence in a court should be, but alas not always is).

Such primers would save money and time because the issues they detail would not realistically be open
to challenge. They would also help in assessing the reliability of expert witnesses who give evidence
on such issues, and they would increase the proportion of cases that are settled without a trial. The
fact that opinions that are generally accepted in the scientific world sometimes turn out to be wrong
is no barrier to this proposal. It is an inherent risk in giving and weighing up scientific evidence.

The legal process is not static, and courts are already working on new ways to test the evidence of
expert witnesses. Many critics think that formal cross-examination risks the court favouring a
more-fluent witness or a cleverer cross-examiner rather than the best evidence. One possibility,
already being adopted as an alternative to cross-examination in some civil litigation, involves
so-called concurrent evidence (or ‘hot-tubbing’ as it is colloquially known), in which the experts and
lawyers sit around a table and discuss the issues at a relatively informal, if structured, meeting
that is chaired and led by the judge. The scientific primers that I have suggested build on this
approach.

How could they be prepared? It would require identifying areas of 
expertise in which a primer would be helpful and feasible, and then 
getting a group of acknowledged experts to formulate the guidance 
in that area. It would also, I think, be necessary for the group to monitor
the primer, to take into account both how it is working and what
advances are being made in the area. 

This would involve the legal and scientific 
communities working closely together, which is 
already starting to happen. As part of broader 
discussions, I and other senior judges are talking 
to scientists and officials at the Royal Societies of 
London and Edinburgh on how they could help 
us to prepare primers. We hope to announce 
some progress soon. 

The law has much to learn from science, in 
terms of both scientific thinking and discoveries 
and inventions. Scientific thinking is inevitably 
different from legal thinking — the idea of what 
constitutes proof and the role of common sense 
are two examples of divergence. But, given the 
importance of experience, logic and humanity 
in both spheres, legal and scientific thought have 
much in common as well. 

As for scientific advances, they interrelate with law both specifically 
(patents, for example) and generally (DNA evidence). And, as scientific 
research improves our understanding of the brain and mental processes,
science will have even more to offer the law on issues such as
mental capacity, the extent of pain and the reliability of memory. 

It is not a one-way relationship. As scientific discoveries and inventions
continue to move into ethically controversial territory, the law
will be able to provide a clear and robust framework to accommodate 
such developments. Two examples include the relationship between 
surveillance and privacy, and genetic engineering. 

More broadly, lawyers and scientists who learn from each other’s 
expertise and experience can benefit society as a whole. Such a relationship
of mutual cooperation is one of which I am sure that Francis
Bacon, the remarkable jurist, scientist and essayist who died 390 years
ago, would have wholeheartedly approved.


David Neuberger is president of the UK Supreme Court in London. 
e-mail: jackie.sears@supremecourt.uk 


Selections from the scientific literature


Interbreeding in ancient Africa

Fossils from Africa exhibit features of both modern humans and archaic species — possible evidence for
interbreeding. However, finding a genetic legacy of such encounters has been difficult because of a
lack of ancient human DNA from Africa. Michael Hammer at the University of Arizona in Tucson and his
team analysed the entire genomes of seven individuals from two contemporary Western African Pygmy
groups. The researchers identified 265 regions of the Pygmy genomes that may have been acquired through
ancient interbreeding with other species, events that could have happened as recently as 9,000 years
ago.

Genome Res. http://doi.org/bct3 (2016)


Gene lets animals tell left from right

A gene defines left and right during embryonic development in snails and frogs.

Animals generally look symmetrical, but internal organs are often positioned asymmetrically. To find
out how embryos first define left and right at the molecular level, Angus Davison at the University of
Nottingham, UK, and his colleagues compared the DNA of pond snails (Lymnaea stagnalis; pictured) that
had shells with clockwise or anti-clockwise spirals. They found that formin, a cell-structure protein,
was consistently linked to spiral direction and is expressed early in snail development, showing
asymmetry even in two-cell embryos.

The team treated frog embryos (Xenopus laevis) with anti-formin drugs, and found that 13% developed an
organ on the opposite side to its normal position, suggesting that formin also coordinates this process
in frogs.

Curr. Biol. http://dx.doi.org/10.1016/j.cub.2015.12.071 (2016)


Hearing is fearing for raccoons

Fear of predators can trigger a cascade of effects through an ecosystem.

Humans have eliminated most predators of raccoons (Procyon lotor; pictured) — such as cougars and
wolves — from the Gulf Islands of British Columbia in Canada. This has allowed the raccoons to forage
almost freely on shoreline species such as crabs and fish. To instil fear in the raccoons, Justin
Suraci at the University of Victoria in Canada and his colleagues broadcast dog vocalizations over
various island shores for one month. They found that foraging by raccoons decreased drastically at
these locations compared to areas where seal vocalizations were broadcast. This caused the number of
some crabs to increase by up to 97% and numbers of some fish to rise by 81%. The snail prey of one crab
species saw declines.

This manipulation of fear shows the cascading effects of losing large predators from ecosystems, the
authors say.

Nature Commun. 7, 10698 (2016)


Where climate 
models fall short 


Climate models tend to overestimate the extent to which climate change contributes to weather events
such as extreme heat and rain.

Omar Bellprat and Francisco Doblas-Reyes at the Catalan Institute of Climate Sciences in Barcelona,
Spain, used an idealized statistical model to compare the frequency of weather extremes in simulations
with and without climate warming. Extreme events seemed to be more closely linked to climate change
when the model was forced to run at low levels of reliability than when the model error was kept to a
minimum.

To account for models’ biased representation of climate variability, studies should rely on calibrated
model ensembles, which are commonly used by weather forecasters, the authors suggest.

Geophys. Res. Lett. http://doi.org/besr (2016)


Light rewrites
memories of place


Researchers have used light to disconnect the memory of
an experience from that of the location where it occurred.

‘Place’ cells in the brain
fire when the animal is in a
particular location, helping it to
remember that place. Stéphanie 
Trouche and David Dupret at 
the University of Oxford, UK, 
and their colleagues wired up 
the brains of mice to monitor 
these cells and switch them 
off using light. When the 
researchers switched off the 
place cells that were associated 
with one of two differently 
shaped enclosures, they found 
that a group of previously silent 
place cells fired instead to 
encode that location. 

The team injected mice in 
one enclosure with cocaine, 
which made the animals 
prefer that location. When 
the initially active place cells 
were switched off, the mice no 
longer sought out the cocaine- 
linked enclosure — yet behaved 
as if it were still familiar. 

Nature Neurosci. http://doi.org/ 
besv (2016) 


CHEMISTRY 


Catalyst for clean drinking water

An efficient and affordable catalyst could improve access to clean drinking water in remote areas.

Hydrogen peroxide is commonly used to treat water and as a disinfectant, but it is synthesized in large-scale facilities at high concentrations that require dilution before use. Simon Freakley and Graham Hutchings at Cardiff University, UK, and their colleagues created a series of catalysts that can be used to make small batches of diluted hydrogen peroxide directly from hydrogen and oxygen.

An earlier version of their catalyst used gold and palladium supported on activated carbon. Their latest version replaces gold with cheaper materials, including tin, zinc and nickel, but it maintains the same high reaction efficiency of more than 95%, and uses commercially available support materials such as titanium dioxide.

Science 351, 965-968 (2016)


Immune changes drive metastasis

Quantifying the number of cancer-fighting immune cells that a tumour contains could offer a way to predict whether it will spread through the body.

Cancer is often deadly when it spreads, but anticipating primary-tumour spread (or metastasis) is difficult. Jérôme Galon at the French National Institute of Health and Medical Research in Paris and his team analysed tumours from more than 800 people with colorectal cancer, comparing people whose tumours had metastasized with those whose tumours had not. The primary tumours from both groups had similar patterns of mutations in cancer genes, but tumours that had spread had fewer cell-killing T cells. The invasive edges of the metastasized tumours also had a lower density of lymphatic vessels, which transport immune cells.

The authors conclude that these changes contribute to metastasis, and that immunotherapies that boost T-cell responses could block the spread of cancer in people with early-stage disease.

Science Transl. Med. 8, 327ra26 (2016)


COSMOLOGY

Missing matter may hide in voids

As much as 30% of the Universe’s observable matter could be hiding in enormous cosmic voids, where it is too sparse for scientists to observe.

Matter in the nearby Universe is said to be missing because astronomers have failed to see as much material as observations of the early Universe suggest there should be. To map how matter might be distributed, Markus Haider at the University of Innsbruck in Austria and his team used a simulation for how galaxies and intergalactic filaments evolved. This modelled the behaviour of both normal and dark matter (an invisible substance detected only by its gravitational pull) in a cube of space 350 million light years (about 107 million parsecs) across.

Analysis of the model, known as Illustris, suggests that the energy of radiating supermassive black holes has flung as much as 24% of normal matter out of galaxies and into voids, where an extra 6% that has yet to gather in filaments also lies. This could help to explain some of the missing matter, say the authors.

Mon. Not. R. Astron. Soc. 457, 3024-3035 (2016)

Sample reveals Antarctic history

The Antarctic ice sheet retreated inland millions of years ago, when atmospheric carbon dioxide levels were not that much higher than they are now.

A team led by Richard Levy of GNS Science in Lower Hutt, New Zealand, analysed a drill core of sediment from McMurdo Sound, Antarctica, to reveal climate history between 21 million and 13 million years ago. The greatest ice-sheet shrinkage was seen when CO₂ levels were 500 parts per million or more: today’s level is about 400 p.p.m., and rising. The researchers conclude that Antarctica (pictured) may respond more quickly to changing CO₂ levels than once thought.

A related study by some of the same authors modelled how the Antarctic ice sheet responded to shifts in climate and found similar changes.

Proc. Natl Acad. Sci. USA http://doi.org/bev4 (2016); http://doi.org/bev5 (2016)

SOCIAL SELECTION

Popular topics on social media

How many replications are enough?

When psychologist Courtenay Norbury came across a paper in Research in Developmental Disabilities this week that had similar conclusions to research she published 12 years ago, she turned to social media with a question. Norbury, who studies children with autism spectrum disorders at University College London, tweeted: “How many times does a research finding need to be replicated before the field says ‘ok, how do we move this forward?’” Dorothy Bishop, a developmental neuropsychologist at the University of Oxford, UK, who helped to write a report on how to improve the reliability of biomedical research, tweeted in response that some fields can get stuck on the same research questions: “The opposite of the reproducibility crisis! Stasis. And yup it’s a problem in some areas.”

Res. Dev. Disabil. http://doi.org/bctr (2016)

NATURE.COM
For more on popular papers: go.nature.com/Syaf4d
How many replication studies are enough?


Researchers on social media ask at what point replication efforts go from useful to wasteful. 
Dalmeet Singh Chawla 
26 February 2016 


When psychologist Courtenay Norbury came across a paper this week that had similar conclusions to research she published 
12 years ago, she turned to social media with a question. Norbury, who studies children with autism spectrum disorders at 
University College London, tweeted: 

How many times does a research finding need to be replicated before the field says "ok, how do we move this forward?" 


— Courtenay Norbury (@lilacCourt) February 23, 2016 


Dorothy Bishop, a developmental neuropsychologist at the University of Oxford, UK, who helped to write a report on how to improve 
the reliability of biomedical research, tweeted in response that some fields can get stuck on the same research questions: 


the opposite of the reproducibility crisis! Stasis. And yup it's a problem in some areas https://t.co/tzX MUQ6h8C 
— Dorothy Bishop (@deevybee) February 23, 2016 

The problem of irreproducibility in science has gained widespread attention, but one aspect that is discussed less often is how to find the right balance between replicating findings and moving a field forward from well-established ones. Norbury has for years reported that people who have autism and poor language skills can find it difficult to make inferences, decipher ambiguous phrases and understand metaphors or jokes. She has also shown that this is not necessarily the case for people with autism who are more linguistically capable. The latest study, published in Research in Developmental Disabilities on 18 February, mirrored the work by suggesting that people with autism who have good language skills do well on such tasks.

Psychologists Melanie Eberhardt at the University of Cologne in Germany and Aparna Nadig at McGill University in Montreal, Canada, authors of the latest study, acknowledge that this evidence is clear and convincing for researchers in this area, but say: “We find that unfortunately there are many lay, professional and academic circles where this result is still not understood.” They add, “Therefore we found it important to add to the convergent evidence on this question.”

“Replication is important but it would be nice sometimes to take a bigger leap forward,” says Norbury. She adds that the next “obvious” step in this research involves intervention, which is challenging to do. “But I’d love to start thinking of how to overcome these obstacles, rather than just repeatedly demonstrating that language impairment has negative impacts for children with autism,” she says.


Brett Buttliere, a research assistant at the Leibniz Institute for Knowledge Media in Tübingen, Germany, tweeted:


@deevybee this is also a large problem in Psychology as well, in my opinion! :D 
— Brett (@BrettButtliere) February 23, 2016


“It is obvious that not publishing when something doesn’t work is bad, but so is doing the same thing over and over again without 
learning something new,” Buttliere said in an interview. 


Virginia Barbour, executive officer of the Australasian Open Access Support Group in Brisbane, Australia, posted her observations: 


@deevybee @lilacCourt I've seen meta analyses of clinical interventions which show that there was clinical certainty yet trials 
continued 


— Ginny Barbour (@GinnyBarbour) February 23, 2016 


In response, Bishop later tweeted a link to a paper published in The Lancet that looked at reducing waste in biomedical research.
The 2014 article said that many studies are done without referencing systematic reviews of the literature, which leads to waste. 


Paul Glasziou, a clinician and researcher at Bond University in Queensland, Australia, who is a co-author of the Lancet paper, says 
that the bar for reproducibility can be set at different heights. For example, he says, the US Food and Drug Administration requires 
a minimum of two positive randomized control trials to show effectiveness of a new drug — a rule that Glasziou says is “reasonable” 
as long as the trials are “well done and adequately powered”. He adds, however, that clinical studies are usually repeated more 
than once, pointing to one analysis of systematic reviews that found that a review cited on average 16 similar papers.


One issue could be that researchers are unaware of similar studies. For example, a study of 1,523 clinical-trial reports published 
between 1963 and 2004 found that, on average, each report cited less than 25% of the previous similar trials that were relevant.


Barbour added in an interview that one way of deciding whether a claim has been reproduced enough could be to analyse reviews 
and meta-analyses of studies from the same field. 


Using novelty as a criterion for publication in journals may solve the problem, notes Norbury, who is an editor of the Journal of Child 
Psychology and Psychiatry. By contrast, many researchers have called for journals to de-emphasize new findings and instead 
publish numerous replications; Norbury says that she agrees “to a certain extent”. 


But, she adds, “my heart sometimes sinks when I see yet another paper exploring what I consider to be well-trodden ground. I’m
not looking for ‘sexy’ findings, but I am looking for something that has the potential to change practice or move the field forward.”


Nature 531, 11 (03 March 2016) doi:10.1038/531011f
Related stories and links

From nature.com

• Over half of psychology studies fail reproducibility test
  27 August 2015
• Collaborate and listen to reproduce research
  16 July 2015
• Irreproducible biology research costs put at $28 billion per year
  09 June 2015
• Reproducibility special


Douglas Eckberg » 2016-03-02 08:47 PM

My heart always sinks when I find people deciding it is time to micromanage others' work. The "right balance" indeed! Either
the researchers of a "replication," as well as its reviewers and journal editors, are not aware of someone else's publication, 
in which case the importance of the original piece is probably not as great as the original writers think, or researchers, 
reviewers, and editors have all agreed that a fresh look is in order. In any case, this is an attempt at a "solution" without a 
clear "problem." 


Leonid Schneider : 2016-02-29 12:39 PM 

"Using novelty as a criterion for publication in journals", as Norbury suggests, is not such a novel idea. Most journals use this 
as selection criteria, with the exception of PLOS One and some others. Such "novelty"-driven approach puts the research 
outcome as more valuable than the research itself, and this is not at all scientific. One should not be surprised then that 
such fraudulent nonsense as STAP ended up in Nature: the result was very novel indeed, even if the science behind it was 
bonkers, and fake on top. Instead, originality of research should be assessed. This would focus on the original 
rationalisation, experimental setup, quality of analysis and scientific coherence (as opposed to dull and thoughtless
re-pipetting of long published research). Judging the true impact of the results of these experiments should be left to peers and
future scientists, not to the "goal-keeper" editors, even those elite wizards of Nature. 
SEVEN DAYS

Ebola drug stutters

A clinical trial of the experimental Ebola treatment ZMapp failed to show a statistically significant benefit. The drug, developed by Mapp Biopharmaceutical, based in San Diego, California, contains three antibodies and had shown promise in animal studies. According to results presented on 23 February at the Conference on Retroviruses and Opportunistic Infections in Boston, Massachusetts, of 36 people given ZMapp, 78% survived, compared with 61% of 35 patients who did not receive the drug. Mapp was forced to end the clinical trial in January without achieving its goal of enrolling 200 patients because of the waning of the Ebola outbreak.


Tetraquark addition

Scientists reported findings of a new tetraquark on 24 February. Elementary particles known as quarks usually bind together in groups of two or three, but physicists have observed some particles composed of four quarks. The new family member, called X(5568), emerged in data from the DZero experiment at the now-inactive Tevatron particle accelerator at Fermilab in Batavia, Illinois. Unlike other examples of tetraquarks, all of which contain at least two quarks of the same type, or ‘flavour’, each of the quarks in X(5568) is different. Studying the particle could help physicists to understand more about the strong force, which holds atomic nuclei together.

Gas leak quantified

Some 97,100 tonnes of methane leaked out of an underground storage facility run by the Southern California Gas Company in Aliso Canyon, California, researchers reported on 25 February. A team led by Stephen Conley, president of Scientific Aviation in Boulder, Colorado, measured methane concentrations above the site during 13 aircraft flights between 7 November and 13 February. The team calculated that the methane release was equivalent to the annual greenhouse-gas emissions from 572,000 cars. The leak began on 23 October and lasted nearly four months.

Sequencing suit

Genome-sequencing giant Illumina said on 23 February that it has filed a lawsuit against UK-based Oxford Nanopore Technologies, the first company to commercialize nanopore sequencing. The technology reads single bases of genetic material as they pass through a nanoscale pore. The suit, by Illumina of San Diego, California, alleges that Oxford Nanopore has infringed on Illumina patents that describe aspects of using pores to read DNA. Oxford Nanopore has its own suite of patents related to the technology. See go.nature.com/7hydeg for more.

NUMBER CRUNCH

The drop in Japan’s population since 2010, according to the latest census. The population has fallen by 0.7%, to 127.1 million. The decline is the first since records began.

Pollinators star in biodiversity report

The Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) announced the findings of its first report on 26 February. The review warns that an ongoing decline in the number of pollinating insects (pictured) and animals threatens global crop production, which depends on pollinators and, as an industry, is worth up to US$577 billion annually. According to the report, the decrease is fuelled by a multitude of factors, including climate change, disease and pesticide use. The IPBES, established in 2012, is modelled roughly on the Intergovernmental Panel on Climate Change.


Chagas scoop

KaloBios Pharmaceuticals of San Francisco, California, is poised to acquire sole distribution rights for a version of benznidazole, one of only two drugs that can treat the insect-borne parasite that causes Chagas disease, after a bankruptcy court ruled in its favour on 26 February. In December, the then chief executive Martin Shkreli announced that the company would price the drug on a level with hepatitis C antivirals, which cost up to US$100,000 per treatment.

PRIZES

Memory work wins

Three British neuroscientists share this year’s Brain Prize for their work on how memories are formed and lost in the brain. Using different approaches, Timothy Bliss, visiting worker at the Francis Crick Institute in London, Richard Morris at the University of Edinburgh, UK, and Graham Collingridge at the University of Bristol, UK, have shown over the past four decades how a brain mechanism called long-term potentiation underpins the ability to learn and remember by strengthening connections between particular neurons. The €1-million (US$1.1-million) prize was awarded on 1 March by the Grete Lundbeck European Brain Research Foundation in Denmark.

EVENTS

Out of deep water

A US jury has acquitted a BP site manager who was in charge of the Deepwater Horizon drilling platform during the disastrous spill in 2010, which led to 11 deaths and leaked huge amounts of oil into the Gulf of Mexico (pictured). According to US media reports, Robert Kaluza faced a criminal charge related to the ensuing pollution, but was acquitted by a jury in New Orleans, Louisiana, on 25 February. Kaluza was one of the last BP defendants to face charges over the incident, although the company still has to pay billions in fines.

Chemistry petition

Chemists are petitioning the chancellor of the University of California, Berkeley, to secure the future of the institution’s College of Chemistry. More than 3,000 people had signed the petition as Nature went to press. Berkeley chancellor Nicholas Dirks announced a “strategic planning process” on 10 February, to try to find solutions to the university's “substantial and growing structural deficit”. A spokesperson for the university told Nature that although the College of Chemistry could be dissolved as a result of this, no decisions have yet been taken and Berkeley is committed to chemistry research and teaching.

Italian protests

Researchers held a protest at the Sapienza University of Rome on 25 February, calling the Italian government's support for research insufficient and erratic. The protest followed a 4 February Correspondence in Nature by Sapienza physicist Giorgio Parisi (G. Parisi Nature 530, 33; 2016) that was supported by 69 researchers. A petition to the Italian government and the European Union started by Parisi had almost 55,000 signatures as of 1 March. Italy spends 1.25% of its gross domestic product on research, but the petition says that the EU should require governments to set a minimum of 3%, as the EU Council of Ministers has advocated in the past.

FUNDING

India funds science

In its annual budget released on 29 February, the Indian government increased funding for the Department of Science and Technology (DST) by 17% from last year, to 44.7 billion rupees (US$650 million). The DST is India’s main funding agency and will use the money to initiate research programmes on energy, water and biomedical devices. The Department of Biotechnology received 18.2 billion rupees, a 12% rise. But news was mixed for other divisions: the Department of Health Research's budget represented a 12% rise compared to 2015-16, whereas the Department of Space got an increase of less than 2%.

TREND WATCH

The first global survey of women’s representation at the highest level of academia shows that just 12% of members of 69 academies surveyed in 2013-14 are female. The Cuban Academy of Sciences had the highest proportion (27%), whereas the Tanzania Academy of Sciences and the Polish Academy of Sciences had the lowest levels, at 4%. Only 40% of the academies had policies that explicitly mention the need for increased participation of women in the academy’s activities. See go.nature.com/cwigqv for more.

WOMEN IN SCIENCE ACADEMIES
Most science academies across the world have more than 80% male membership.
SOURCE: ACADEMY OF SCIENCE OF SOUTH AFRICA; INTERACADEMY PARTNERSHIP.

7-11 MARCH
The United Nations and Costa Rica Workshop on Human Space Technology convenes in San Jose, Costa Rica.
go.nature.com/swmviz

8-10 MARCH
Seattle, Washington, hosts the Climate Leadership Conference, to discuss US climate policy and innovation in the wake of the Paris agreement.
go.nature.com/pwplig

10 MARCH
The US Patent and Trademark Office starts proceedings over who holds the rights to commercialize CRISPR-Cas9 gene-editing technology.
go.nature.com/qvnsn8
Speedier Arctic data as warm winter shrinks sea ice


Scientists push for better monitoring of what remains. 
Alexandra Witze 


01 March 2016 


UPI/Benjamin Nocerini/US Coast Guard/Eyevine 


Researchers have found a way to track changes in sea-ice thickness in near real time. 


Following a record winter in many ways, Arctic sea-ice cover seems poised to reach one of its smallest winter maxima ever. As of 
28 February, ice covered 14.525 million square kilometres, or 938,000 square kilometres less than the 1981-2010 average. And 
researchers are using a new technique to capture crucial information about the thinning ice pack in near real time, to better forecast 
future changes. 


Short-term weather patterns and long-term climate trends have conspired to create an extraordinary couple of months, even by 
Arctic standards. “This winter will be the topic of research for many years to come,” says Jennifer Francis, a climate scientist at 
Rutgers University in New Brunswick, New Jersey. “There’s such an unusual cast of characters on the stage that have never played 
together before.” 


The characters include the El Niño weather pattern that is pumping heat and moisture across the globe, and the Arctic Oscillation, a
large-scale climate pattern whose shifts in recent months have pushed warm air northward. Together, they are exacerbating the 
long-term decline of Arctic sea ice, which has shrunk by an average of 3% each February since satellite records began in 1979. 


A persistent ridge of high-pressure air perched off the US West Coast has steered weather systems around drought-stricken California, funnelling warmth northward. As a consequence, sea ice is particularly scarce this year in the Bering Sea. “The ice would normally be extensive and cold, but we have open water instead,” says Francis.

A storm last December compounded the situation by pushing warm air (more than 20 °C above average) to the North Pole. In January, an Arctic Oscillation-driven warm spell heated the air above most of the Arctic Ocean. By February, ice had begun to circulate clockwise around the Arctic basin and out through the Fram Strait, says Julienne Stroeve, a researcher at the US National Snow and Ice Data Center (NSIDC) in Boulder, Colorado.


Given the Arctic’s notoriously unpredictable weather, the low maximum doesn’t necessarily foretell record-low melting this summer, 
when sea ice will reach its annual minimum. (The biggest summer melt on record happened in 2012, a year without an El Niño.) But
researchers have one new tool with which to track the changes as they happen this year — the first detailed, near-real-time 
estimates of ice thickness, from the European Space Agency’s CryoSat-2 satellite. 


Three research groups currently calculate Arctic ice thickness from satellite data, but with a lag time of at least a month. Faster 
estimates would allow shipping companies to better plot routes through the Arctic, and scientists to improve their longer-term 
forecasts of ice behaviour. “The quicker you have these estimates of sea-ice thickness, the quicker you can start assimilating them 
into models and make more timely predictions of what’s going to happen,” says Rachel Tilling, a sea-ice researcher at University 
College London. 


She and her colleagues have developed a faster way to get information on ice thickness from CryoSat-2 (see ‘Measuring stick’). 
The satellite measures thickness by comparing the time that it takes for radar signals to bounce off the ice, as opposed to open 
water. Normally, it takes several months for satellite operators to calculate CryoSat-2’s precise orbit (and therefore the exact
location of the ice and water that it flew over). But Tilling’s group instead runs a quick-and-dirty analysis of orbital data, then 
combines it with near-real-time information on ice concentration from the NSIDC and ice type from the Norwegian Meteorological 
Service (R. L. Tilling et al. Cryosphere Discuss. http://doi.org/bcw5; 2016). 
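
The measurement principle described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the processing chain used by Tilling's group: the function names are hypothetical, the density values are typical textbook figures rather than numbers from the article, and the snow layer on top of the ice is ignored. The first step converts the difference in two-way radar travel time between ice and open water into the height of the ice above the sea surface (the freeboard); the second uses the balance between the weight of the floating ice and the water it displaces to estimate total thickness.

```python
# Illustrative sketch only: names and density values are assumptions,
# not taken from the article or from the CryoSat-2 processing software.

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def freeboard_from_delay(t_ice_s: float, t_water_s: float) -> float:
    """Height of the ice surface above open water, in metres.

    The radar echo from the ice returns sooner than the echo from the
    lower water surface; half the extra two-way travel time, multiplied
    by the speed of light, gives the height difference.
    """
    return SPEED_OF_LIGHT * (t_water_s - t_ice_s) / 2.0


def thickness_from_freeboard(
    freeboard_m: float,
    rho_water: float = 1024.0,  # assumed sea-water density, kg per cubic metre
    rho_ice: float = 915.0,     # assumed sea-ice density, kg per cubic metre
) -> float:
    """Ice thickness from freeboard, assuming floating ice with no snow.

    Floating balance: rho_ice * T = rho_water * (T - freeboard),
    so T = freeboard * rho_water / (rho_water - rho_ice).
    """
    return freeboard_m * rho_water / (rho_water - rho_ice)


if __name__ == "__main__":
    # A delay difference that corresponds to 0.2 m of freeboard.
    delay = 2 * 0.2 / SPEED_OF_LIGHT
    fb = freeboard_from_delay(0.0, delay)
    print(f"freeboard: {fb:.2f} m, thickness: {thickness_from_freeboard(fb):.2f} m")
```

Because only about one-tenth of the ice floats above the waterline under these assumed densities, a small error in the measured height is magnified roughly ninefold in the thickness estimate, which is one reason the precise location and surface type beneath the satellite matter.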
MEASURING STICK
Radar data from the CryoSat-2 probe can now be used to track sea-ice thickness in near real time. The probe can distinguish between the height of sea ice and that of the surrounding ocean surface.
The result is ice-thickness measurements that are ready in just 3 days, and accurate to within 1.5% of those produced months later.

The current winter cycle is the first complete season for the near-real-time data. (The measurements cannot be done in the summer, when melt ponds on the ice confuse the satellite.) Tilling has begun to speak to shipping companies, among others, that are interested in using the data as fast as they are produced. “It really is a new era for CryoSat-2,” she says.

More-accurate ice-thickness data would improve climate models and give better forecasts for the possible impacts of thick or thin sea ice, says Nathan Kurtz, a cryosphere scientist at NASA’s Goddard Space Flight Center in Greenbelt, Maryland. Kurtz helps to lead NASA’s IceBridge project, which will begin flying aeroplanes north of Greenland later this month to measure ice thickness using lasers and an infrared camera that can detect heat from the underlying water.


Thickness measurements are more crucial than ever, given the changing Arctic, says David Barber, a sea-ice specialist at the University of Manitoba in Winnipeg, Canada. He and his colleagues reported last year that there is increased open water all around the edge of the Arctic ice pack every month of the year (D. G. Barber et al. Prog. Oceanogr. 139, 122-150; 2015).


“We're getting more open water in the winter than we were expecting,” Barber says. “These changes are happening very quickly, 
and I don’t think people are fully aware of how dramatic they are.”
India’s budget keeps dream of genomics hub alive
Biotechnology agency wants to upgrade capabilities to kick-start economic growth. 

T. V. Padma 


29 February 2016 Updated: 01 March 2016 


Adnan Abidi/TPX/Reuters 


India’s finance minister Arun Jaitley arrives at the parliament to present the latest federal budget, which 
contained a moderate boost for biotechnology. 


An ambitious plan to turn India into a world-class centre for genomics research and commercialization 
received a modest boost on 29 February when the government announced its annual budget. 


A big winner in the budget was India’s main science funding agency, the Department of Science and Technology, which received 44.7 billion rupees (US$650 million), a 17% hike on last year’s allocation. The Department of Biotechnology (DBT), meanwhile, received 18.2 billion rupees, a 12% rise on the previous year. The increase is an indication of how the National Biotechnology Development Strategy, which the department unveiled last December, is likely to evolve.

The budget brought mixed news for other departments engaged in scientific research (see ‘Budget allocations’). The Department of Health Research’s budget represented a 12% rise in funding compared with the money pledged in 2015-16, for instance, whereas the Department of Space got an increase of less than 2%.


Budget allocations (millions of rupees)

Department                                           2015-16    2015-16 (revised)    2016-17
Department of Atomic Energy*                         109,120    113,840              116,825
Defence Research & Development                       143,585    124,912              135,938
Ministry of Earth Sciences                            16,197     14,180               16,724
Ministry of New & Renewable Energy                     3,032      2,621               50,358†
Department of Science and Technology                  38,357     38,286               44,702
Department of Scientific and Industrial Research      40,310     40,367               40,628
Department of Biotechnology                           16,251     16,244               18,200
Department of Space                                   73,880     69,594               75,091
Department of Health Research                         10,181     10,126               11,448
Department of Agricultural Research and Education     63,200     55,860               66,200

*Includes budget for operating nuclear power stations.
†Includes money from a clean energy levy, now renamed the Clean Environment Cess, in contrast to previous years.


Source: Ministry-approved budget. 
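
The percentage rises quoted in the text can be checked against the table with a short calculation. The sketch below is an illustration only: it assumes that each rise is measured against the original 2015-16 allocation (the first column) rather than the revised figure, and the function name `percent_rise` is a hypothetical helper.

```python
# Figures in millions of rupees, copied from the 'Budget allocations' table:
# (original 2015-16 allocation, 2016-17 allocation).
allocations = {
    "Department of Science and Technology": (38_357, 44_702),
    "Department of Biotechnology": (16_251, 18_200),
    "Department of Health Research": (10_181, 11_448),
    "Department of Space": (73_880, 75_091),
}


def percent_rise(old: float, new: float) -> float:
    """Percentage change from the old allocation to the new one."""
    return 100.0 * (new - old) / old


if __name__ == "__main__":
    for department, (old, new) in allocations.items():
        print(f"{department}: {percent_rise(old, new):.1f}%")
```

Rounded to whole numbers, these figures agree with the rises of 17%, 12%, 12% and less than 2% reported in the article.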


The DBT’s allocation is roughly half of what Krishnaswamy VijayRaghavan, the department's secretary, estimated was needed when the strategy was released, but he is confident that the remainder can be made up from other sources. The allocation is “very good” in terms of implementing the strategy, he told Nature.


The DBT’s strategy aims to ramp up India’s total biotech revenues by more than tenfold since the 
industry started up two decades ago, to $100 billion by 2025. The idea is to kick-start the economy by 
replicating the success of the information-technology boom that has fuelled economic growth for more 
than 20 years. “Biotechnology can be another vibrant model for growth that India can offer,” said the 
science minister, Harsh Vardhan, back in December. 


India’s pharmaceutical industry is largely confined to the production of ‘generic’ copies of existing drugs, 
and to contract research organizations, which conduct clinical trials of drugs and vaccines on behalf of 
pharmaceutical companies. Both have grown into successful industries, says VijayRaghavan, but the 
nation “is now ready to upgrade its biotech capabilities”.


The DBT received an encouraging sign last year, when its budget allocation for 2015 was not 
subsequently slashed during mid-term budget revisions, as often happens. The latest 12% increase
builds on that good news, says VijayRaghavan. Ahead of this year’s budget, however, VijayRaghavan 
said that this figure would need to rise to 25-30% annually over the next five years to implement the 
strategy. He now says that the DBT could make up the difference with funds from elsewhere, including 
a national innovation mission launched in January, which identified biotechnology as a key area — in 
particular, to bolster the parts of the strategy aimed at nurturing biotech start-ups and supporting 
entrepreneurs. 


The DBT could also tap into the science department’s budget. Department secretary Ashutosh Sharma, 
who described the increase as “fantastic”, says that his department’s plans include a rise in the number 
and quality of start-ups and business incubators, support for scientists undertaking high-risk research, 
the promotion of industry-relevant research at academic institutes and new research programmes on, 
among other things, biomedical devices. 


Some of these plans overlap with the DBT’s strategy. Ahead of the budget announcement, VijayRaghavan
told Nature that the strategy revolves around three core activities. The first is the creation of new
infrastructure. India already hosts several genomics research institutes, such as the Institute of
Genomics and Integrative Biology (IGIB) and the National Institute of Plant Genome Research, both in
New Delhi, and the National Institute of Biomedical Genomics near Kolkata. The DBT’s strategy aims to
create five more centres, each dedicated to a different field, including drug discovery, marine biology
and infection, as well as several centres of excellence based on narrower, high-priority areas such as
genetically modified organisms, vaccines and marine bioproducts.


NCBS

“India is now ready to upgrade its biotech capabilities.” Krishnaswamy VijayRaghavan, secretary of the
Department of Biotechnology.


The second activity is the provision of training in the analysis of big data. Indian geneticists have 
previously discovered mutations in breast-cancer genes that are unique to the Indian population
(M. T. Valarmathi et al. Hum. Mutat. 23, 205; 2004) and sequenced the genomes of several crops,
including chickpea (R. K. Varshney et al. Nature Biotechnol. 31, 240-246; 2013). The DBT’s strategy 
aims to train researchers in the scanning and analysis of large numbers of genomes. VijayRaghavan 
notes that the country’s existing expertise in computing and biology will help this. “We need to bring the 
two skill sets together,” he says. 


The strategy also aims to create 150 technology-transfer organizations to help to commercialize 
discoveries made in publicly funded research laboratories, together with 40 technology and business 
incubators, which will provide equipment and guidance for new firms, and facilitate networking 
opportunities. Together, these all comprise the third core activity, says VijayRaghavan, and will build on
the activities of the DBT’s Biotechnology Industry Research Assistance Council, which was set up in 
2012. 


Lipi Thukral, a computational structural biologist at the IGIB, sees the DBT’s goal of creating a genomics 
hub as an opportunity for India to enter the emerging field of precision medicine, which uses genomic, 
physiological and other data to tailor treatments to the individual. Achieving that will require clinicians to 
gather extensive data and to interact better with academics to analyse the data, says Thukral. To 
succeed, the strategy will also need a feedback mechanism between the biotech corporate sector and 
academic institutions, policies to protect scientists undertaking high-risk, high-reward research and a 
reorientation of academia towards more industry-relevant research, she adds. 


Others have reservations about the scale of the DBT’s ambitions. “The strategy seems overwhelming, 
overburdened, and implementation would be a Herculean task,” says Nalini Vemuri, vice-president for 
research and development at the Gurgaon-based company Lifecare Innovations. The strategy spans 
four major areas — health care, food and nutrition, energy and education. But although these represent 
an “impressive vision”, says Vemuri, a more narrowly focused goal, for example to improve the country’s 
research in malaria and tuberculosis, would have a better chance of success. 


Ukraine has lost access to the Crimean Astrophysical Observatory.


Conflicting laws threaten Ukrainian science


Country’s austerity budget stands in way of law to modernize Soviet-era academy. 


BY QUIRIN SCHIERMEIER 


As political turmoil and conflict rock Ukraine, the country’s main scientific organization is in a
bind. In January, Parliament passed a law to modernize the ailing National Academy of Sciences of
Ukraine (NASU). Yet an austerity budget imposed around the same time makes this impossible to achieve,
at least this year. The resulting cuts to science funding threaten the jobs of young researchers in
particular, who are best poised to revitalize the country’s failing economy.
“We have an extraordinarily high number of potential young scientists who are ready to work for the
welfare of the country,” says Liliya Hrynevych, who chairs the Ukrainian Parliament’s Committee on
Science and Education and voted in favour of the modernizing law. “But without setting priorities for
science and research, it will be impossible for Ukraine to become a strong and wealthy European nation.”
The academy employs some 20,000 scientists across 120 research institutes. On 26 November, Parliament
began to debate a “law of Ukraine on scientific and technical activity”, in an attempt to streamline
and strengthen the organization, which was founded in the Soviet era. Long deemed outdated and
resistant to modernization, the academy uses an opaque system to award funding, and many of its
members are elderly, not least the 97-year-old metallurgist Boris Paton, who has run the NASU for
decades.

The law stipulates the creation of a science advisory council that includes foreign specialists,
and an independent grant-giving agency. All NASU institutes will undergo an external evaluation to
examine their productivity and efficiency, and overall, government science spending must increase
from a current 0.3% of gross domestic product to at least 1.7% by fiscal year 2017, near the
European Union average.


AILING ACADEMY

Despite the introduction of a law aimed at modernizing the National Academy of Sciences of Ukraine,
funding is in decline.

But before the law took effect, Ukraine 
passed its 2016 austerity budget, in the wake 
of widespread closure of mines and factories, 
inflation, debt and currency devaluation. The 
budget allocates a meagre 2.05 billion hryvnia 
(US$76 million) to the NASU — about 12% 
less than in 2015, continuing a trend of decline 
(see ‘Ailing academy’). 

The cutbacks are irreconcilable with the 
science law, says Hrynevych, who is campaigning
in Parliament for a budget revision after the
first quarter of 2016. The budget will leave the 
academy with scarcely enough to cover the scant 
salaries (about US$200 per month on average) 
paid to its administrative staff and scientists. 
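
The budget figures quoted above can be cross-checked with some simple arithmetic. The sketch below uses only numbers given in the article (2.05 billion hryvnia, US$76 million, a cut of about 12%, some 20,000 scientists, and an average salary of about US$200 per month). It is a rough estimate under those assumptions, and it leaves out administrative staff, whose number is not stated, so the salary bill is a lower bound.

```python
# All inputs are figures quoted in the article; the derived values are rough estimates.
NASU_2016_UAH = 2.05e9        # 2016 allocation in hryvnia
NASU_2016_USD = 76e6          # the same allocation in US dollars
CUT_FRACTION = 0.12           # "about 12% less than in 2015"
SCIENTISTS = 20_000           # "some 20,000 scientists"
MONTHLY_SALARY_USD = 200      # "about US$200 per month on average"

# Exchange rate implied by the two budget figures (hryvnia per US dollar).
implied_rate = NASU_2016_UAH / NASU_2016_USD

# 2015 allocation implied by a 12% cut.
implied_2015_uah = NASU_2016_UAH / (1 - CUT_FRACTION)

# Annual salary bill for scientists alone (administrative staff not included).
annual_salary_bill_usd = SCIENTISTS * MONTHLY_SALARY_USD * 12

# Share of the 2016 budget taken up by that salary bill.
salary_share = annual_salary_bill_usd / NASU_2016_USD

print(f"Implied exchange rate: {implied_rate:.1f} hryvnia per US$")
print(f"Implied 2015 budget: {implied_2015_uah / 1e9:.2f} billion hryvnia")
print(f"Scientists' salaries: US${annual_salary_bill_usd / 1e6:.0f} million per year")
print(f"Share of 2016 budget: {salary_share:.0%}")
```

On these numbers, scientists' salaries alone would take up well over half of the allocation before administrative pay, equipment or consumables are counted.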

“We won’t be able to buy any new equipment this year, and purchase of consumables will need to be
reduced to a minimum,” says Anatoly Zagorodny, director of the Bogolyubov Institute for Theoretical
Physics in Kiev and a vice-president of the academy.

The fresh cuts, he says, will also force institutes to reduce staff, in some circumstances by more
than one-third, and to discontinue many areas of research, even though science is crucial to economic
recovery.

Young scientists are the least protected by 
existing labour laws and so will bear the brunt of
the job cuts, says Irina Yehorchenko, a research 
fellow at the NASU’s Institute of Mathematics 
in Kiev. She and some of her colleagues 
launched a petition in December calling on 
the country’s president, Petro Poroshenko, to 
save Ukrainian science. 


YOUTHFUL POTENTIAL 

“I, for one, might be able to find a postdoc 
position abroad,” says Oleksandr Skorokhod, a 
cell biologist at the NASU Institute of Molecular 
Biology and Genetics in Kiev who is chair of the 
academy’s Council of Young Scientists. “But I'd 
much rather stay and try to change the bad state 
of affairs in my country.”


Ukrainian science has struggled to recover from Russia’s annexation of the Crimea peninsula in 2014.
The general consensus in the international community is that Crimea is still part of Ukraine: the
United Nations General Assembly declared invalid a March 2014 referendum in which voters in Crimea
approved the peninsula's secession from Ukraine.

But all 22 Crimean institutes formerly run 
by the NASU are now under Russian control, 
and only a few of their 1,320 staff members have 
relocated to Ukraine-controlled territory. The 
academy lost access to its only research ship, the 
RV Professor Vodianytsky, three astronomical 
observatories in Nauchny, Katsiveli and Yevpatoria
and the 204-year-old Nikitsky Botanical
Garden near Yalta, on the Black Sea shore. 

The Ukrainian government, moreover, 
expects scientists in Ukraine to cut all ties with
colleagues who stayed on the peninsula, says 
Hrynevych, because any collaboration would be 
viewed as legitimizing the Russian occupation. 

The armed conflict with pro-Russian militants in eastern Ukraine is also causing problems for
scientists, especially in the country’s Donbas region. Some 12,000 scientists and university lecturers
there, about 60% of the former staff of 26 research institutes and universities in the province, have
moved to safe institutions in Kiev and elsewhere. But many evacuating scientists left behind equipment
or lost irreplaceable research material. Marine, environmental and climate studies in the Black Sea
region, mining-related geology and a variety of archaeological and historical research have all been
hit hard, says Zagorodny.


PUBLIC HEALTH 


Spectre of Ebola haunts Zika response


Agencies rush to show that outbreak tactics have improved. 


BY ERIKA CHECK HAYDEN 


Public-health workers are still struggling to stamp out the Ebola epidemic in West Africa. But the
lessons learned from that outbreak, which exposed major flaws in the global public-health system, are
shaping the escalating international response to the spread of Zika virus in the Americas.

“Ebola is the gorilla in the room,” says 
Lawrence Gostin, a health-law and policy 
specialist at Georgetown University in 
Washington DC. “It’s driving everything.”

He and others say that governments and 
international public-health agencies seem 
determined not to repeat the main mistake 
that they made with Ebola: waiting for much 
too long to respond to a brewing outbreak. The 
delay allowed Ebola to grow so out of control 
in West Africa that the epidemic there persists 
after more than 2 years and 11,000 deaths. 

By contrast, the global health community has moved aggressively against Zika, beginning with a
declaration from the World Health Organization (WHO) on 1 February that the clusters of microcephaly
and other neurological disorders that have appeared in Brazil coincident with an outbreak of the virus,
and previously in French Polynesia, constitute an international public-health emergency.

The WHO has never before made such a declaration without knowing the cause of the condition of
concern. The August 2014 declaration that Ebola was a public-health emergency came after the disease
had been spreading in West Africa for 8 months and had killed 932 people. But although Zika has
probably infected as many as 1 million people in the latest outbreak, the vast majority have
recovered. And scientists have not proved a link between Zika and microcephaly, a condition in which
infants are born with abnormally small heads and brains.
“The WHO has perhaps gotten out ahead 
of its usual position of gathering and verifying 
all the evidence before taking a clear position,” 
says Adam Kamradt-Scott, a health-security 
specialist at the University of Sydney in 
Australia. “The WHO couldn’t afford to be
seen to be asleep at the wheel a second time.”
Other authorities have taken similarly bold action. On 3 February, the US Centers for Disease Control
and Prevention (CDC) moved its emergency-response operations centre to its highest activation level,
jump-starting US government research into, and surveillance of, the Zika virus. On the same day, the
United Kingdom announced the creation of a Zika research fund with an initial budget of up to
£1 million (US$1.4 million). And on 8 February, US President Barack Obama requested $1.8 billion from
lawmakers for Zika-response activities. (By comparison, Obama’s $6.18-billion request for
Ebola-response funding came 3 months after that virus was declared a global emergency.)
The ongoing mobilization against Zika is not an over-reaction, says Suerie Moon, a global-health
researcher at the Harvard T. H. Chan School of Public Health in Boston, Massachusetts. Although Zika,
unlike Ebola, is not usually fatal, it has the potential to cause suffering and social and economic
havoc. “It’s encouraging to see leadership and mobilization from WHO, CDC and other public-health
institutions,” Moon says. “It shows that some of the lessons from Ebola have been digested.”
WHO director-general Margaret Chan has 
acknowledged the agency’s failings on Ebola, 
citing “inadequacies and shortcomings in this 
organization's administrative, managerial and 
technical infrastructures” in a speech last year. 
The Zika response also highlights persistent 
flaws of the global public-health system. Zika 
was first discovered in Africa in 1947, and 
caused a major outbreak in 2013 in the Pacific 
islands, but there is still no vaccine, treatment 
or common diagnostic test for the virus. 
Kamradt-Scott wonders if the world would 
be tracking Zika’s spread so closely if the virus 
had not emerged in Brazil, where hundreds of 
thousands of tourists are scheduled to attend the 
Olympic Games in August. “My own perception
is that the international community hasn't
responded particularly swiftly to Zika,” he says. 
Moon notes that although the WHO is trying to ensure that researchers in government, academia and
industry share data on the outbreak, drug companies developing Zika vaccines have not publicly agreed
to participate.
The WHO has long struggled to modulate its response to global-health crises, Gostin says. After it was
criticized for reacting too strongly to the 2009 H1N1 influenza epidemic (declaring a full-scale
pandemic, when the virus itself did not prove as deadly as was initially feared), it dialled back its
response to the Ebola outbreak. Now the WHO is mounting an urgent response to Zika, in light of
criticism of its reaction to Ebola. To Gostin, this inconsistency reinforces a perception that the WHO
acts mainly on the basis of political, not medical, factors. “We need to stop fighting the last war,”
he says.


Successful test drive for space-based gravitational-wave detector


Mission paves the way for planned €1-billion space observatory. 
Elizabeth Gibney 
25 February 2016 


Scientists have long dreamed of launching a constellation of detectors into space to observe 
gravitational waves — the ripples in space-time predicted by Albert Einstein and observed for the first 
time earlier this month. 


That dream is now a step closer to reality. Researchers working on a €400-million (US$440-million)
mission to try out the necessary technology in space for the first time, which involves firing lasers
between metal cubes in free fall, have told Nature that the initial test drive is performing just as
well as they had hoped.


“I think we can now say that the principle has worked,” says Paul McNamara, project scientist for the
LISA Pathfinder mission, which launched last December. “We believe that we now are in a good shape to
look to the future and look to the next generation.”


“Everything works as we designed it. It’s sort of magical, and you rarely see that in your career as an
experimentalist,” says Stefano Vitale, a physicist at the University of Trento in Italy, and a principal
investigator for the Pathfinder mission.


PRECISION LAB IN SPACE

LISA Pathfinder has shown that an intricate experiment consisting of two metal
cubes in freefall, isolated from all forces except gravity, can operate in space.

At the heart of Pathfinder are two free-falling metal cubes, shielded from all forces except gravity
by their housing. Any disturbance to the relative motion of the cubes affects the frequency of the
laser bouncing between them. The housing monitors each cube’s position and commands the craft to move
so that the cube is always at its centre. The cubes float in a vacuum, surrounded by instruments that
mitigate stray forces.

Source: ESA/ATG medialab


The European Space Agency (ESA) financed the test, and hopes ultimately to launch a €1-billion 
observatory to hunt for gravitational waves. For that mission, lasers would be bounced between three 
spacecraft set millions of kilometres apart. Each craft would contain a test mass (a metal cube) that 
would be placed in free fall, protected from any forces except that of gravity. Because gravitational 
waves stretch and compress space-time, the observatory hopes to be able to see passing waves by 
using the lasers to detect minute changes in the distance between the free-falling cubes. 


Because of its enormous scale, a space-based observatory could detect lower-frequency gravitational 
waves than can Earth-based experiments — such as the US Advanced Laser Interferometer 
Gravitational-Wave Observatory, which announced a first successful detection on 11 February.
Lower-frequency waves can be triggered by more-powerful events, which scientists hope to study, such as
collisions between galaxies and supermassive black holes. 


The Pathfinder mission aimed to show, on a much smaller scale, that the basic design works, and to
chart its limitations. It uses two test masses (each a 2-kilogram cube of gold and platinum) set 38
centimetres apart, floats them in isolation from everything except the influence of gravity, and tests to
see whether changes in their relative movement can be measured with an accuracy of a picometre, roughly
one hundred-millionth of the width of a human hair. To keep the cubes in free fall, the spacecraft
monitors their motions and uses tiny thrusters to keep itself centred on the masses.
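
As a rough check on the scale involved, the sketch below compares a picometre with the width of a human hair and with the 38-centimetre gap between the cubes. The hair width of about 100 micrometres is an assumed typical value, not a figure from the article. Real hairs vary widely, so the first ratio is only an order-of-magnitude estimate.

```python
import math

PICOMETRE_M = 1e-12          # one picometre in metres
HAIR_WIDTH_M = 100e-6        # assumed typical hair width (about 100 micrometres)
CUBE_SEPARATION_M = 0.38     # separation of the test masses quoted in the article

# Fraction of a hair's width that one picometre represents.
hair_fraction = PICOMETRE_M / HAIR_WIDTH_M

# Measurement precision relative to the distance between the cubes.
relative_precision = PICOMETRE_M / CUBE_SEPARATION_M

print(f"1 pm is about {hair_fraction:.0e} of a hair's width")
print(f"1 pm is about {relative_precision:.1e} of the cube separation")
print(f"Order of magnitude versus hair: 10^{round(math.log10(hair_fraction))}")
```

With these inputs, a picometre is about one part in a hundred million of a hair's width, and a few parts in a trillion of the distance between the two cubes.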
The complexity of such an experiment, carried out millions of kilometres from Earth, meant that sceptics
doubted whether it could ever work in space, says Vitale. But data that have been streamed back since 
23 February, when Pathfinder began to use the lasers to track its released cubes, show that it not only 
fulfils its requirements but exceeds them, he says. For now, the team is keeping under wraps details on 
exactly how well the instruments are performing. 


Proving that the basic technology works is only the mission’s first step. Its main science goal, which the 
Pathfinder team will work on over the coming months, is to understand where 'noise' in the system is 
coming from. That knowledge will be essential in designing the space-based observatory, which is 
scheduled for launch in 2034. “The main goal of the mission is not so much to measure how well we’re 
doing, but to understand how well we're doing,” says McNamara. 


Success of Pathfinder was seen as a prerequisite for building the observatory, which ESA agreed to 
fund in 2013. But before launching such an ambitious experiment, scientists also considered it desirable 
that gravitational waves should already have been seen on Earth-based detectors. “It looks like these 
two conditions have been fulfilled in the same month. So it’s really our month,” adds Vitale. 


antonio carlos pocob motta » 2016-03-02 02:00 PM

I believe that the gravitational Waves Must permit that can to know deeply beyond the evento
horizons where the GTR might "break",new distortions in the spacetiime métrics altering the 
speed of light, how much major the curvatures of spacetime major Must to be the deformations 
suffered by the speed of light?Due that breaks in GTR the singularity cannot occur,and some 
minimum lenght as the possible tiny of the strings might appear as fundamental vibrations and 
theirs harmonics admits the speed Must be variable,and the tiny of strings are 
spacetime,generated by the switches of left handed and right handed that are 
asymmetrics((particles-antiparticles)then eachtimes "particles" vibrate in a frequency only one(it 
isthe spacetime continuum,that is generated by particles that run forward in time and at the time 
run backward in time as antiparticles). Then the spacetime are tiny of strings with is 
harmonics,that are quantum entanglement.the in those regions of strongest energy of the black 
holes deform the curves of spacetime through ripples of gravitational Waves,as well as warping 
the quantum vaccum generating stronger fluctuations in the spacetime geometry being the 
topology not totally smoothes ,that are knots in the fabricas of spacetime,or foram spacetime 


Pentcho Valev » 2016-02-25 07:47 PM 

Variable Speed of Light, No Gravitational Waves 
http://www.damtp.cam.ac.uk/user/tong/concepts/gr.pdf David Tong: "The rocket has height h. It 
starts from rest, and moves with constant acceleration g. Light emitted from the top of the rocket 
is received below. By this time, the rocket is travelling at speed v=gt=gh/c. This gives rise to the 
Doppler effect. (Neglecting relativistic effects). f'=f(1+v/c)=f(1+gh/c^2), where f' is received
frequency and f is emitted frequency." Since f=c/λ (λ is the wavelength), we have f' = f(1+v/c) =
(c+v)/λ where c'=c+v is the speed of the light relative to the receiver (the bottom of the rocket).
Clearly the speed of light (relative to the receiver) varies with the speed of the receiver, in 
violation of Einstein's relativity. This means that David Tong's subsequent derivation of 
gravitational time dilation is invalid. There is no gravitational time dilation, and accordingly there 
are no gravitational waves. Pentcho Valev 


Pentcho Valev » 2016-02-26 08:27 AM
Clever Einsteinians know that there is no gravitational time dilation:
http://www.printsasia.com/book/relativity-and-its-roots-banesh-hoffmann-0486406768
Banesh Hoffmann: "In an accelerated sky laboratory, and therefore also in the
corresponding earth laboratory, the frequence of arrival of light pulses is lower than the
ticking rate of the upper clocks even though all the clocks go at the same rate. (...) As a
result the experimenter at the ceiling of the sky laboratory will see with his own eyes that the
floor clock is going at a slower rate than the ceiling clock - even though, as I have stressed,
both are going at the same rate. (...) The gravitational red shift does not arise from changes
in the intrinsic rates of clocks. It arises from what befalls light signals as they traverse space 
and time in the presence of gravitation." What befalls light signals as they traverse space 
and time in the presence of gravitation? They accelerate of course, just as ordinary falling 
objects do, and this variation of the speed of light predicted by Newton's emission theory of 
light causes the gravitational redshift (or blueshift): 
http://courses.physics.illinois.edu/phys419/sp2013/Lectures/I13.pdf University of Illinois at
Urbana-Champaign: "Consider a falling object. ITS SPEED INCREASES AS IT IS
FALLING. Hence, if we were to associate a frequency with that object the frequency should 
increase accordingly as it falls to earth. Because of the equivalence between gravitational 
and inertial mass, WE SHOULD OBSERVE THE SAME EFFECT FOR LIGHT. So lets 
shine a light beam from the top of a very tall building. If we can measure the frequency shift 
as the light beam descends the building, we should be able to discern how gravity affects a 
falling light beam. This was done by Pound and Rebka in 1960. They shone a light from the 
top of the Jefferson tower at Harvard and measured the frequency shift. The frequency shift 
was tiny but in agreement with the theoretical prediction."
http://www.einstein-online.info/spotlights/redshift_white_dwarfs Albert Einstein Institute: "One of the three
classical tests for general relativity is the gravitational redshift of light or other forms of 
electromagnetic radiation. However, in contrast to the other two tests - the gravitational 
deflection of light and the relativistic perihelion shift -, you do not need general relativity to 
derive the correct prediction for the gravitational redshift. A combination of Newtonian 
gravity, a particle theory of light, and the weak equivalence principle (gravitating mass 
equals inertial mass) suffices. (...) The gravitational redshift was first measured on earth in 
1960-65 by Pound, Rebka, and Snider at Harvard University..." And since there is no 
gravitational time dilation, there are no gravitational waves either. Pentcho Valev 


Alone: bad. Friend: good! » 2016-02-26 09:30 AM
Do you think space has a fabric or a medium or anything other than being literally 
empty and nothing? 


Pentcho Valev » 2016-02-25 12:41 PM

"gravitational waves — ripples in space-time first predicted by Albert Einstein" There was no such 
prediction. LIGO folks use the same myth: 
https://science.house.gov/sites/republicans.science.house.gov/files/documents/HHRG-114-SY-WState-DReitze-20160224.pdf
Testimony of Dr. David Reitze, Executive Director LIGO
Laboratory, California Institute of Technology, Before the U.S. House of Representatives, 
Committee on Science, Space and Technology, on Unlocking the Secrets of the Universe: 
Gravitational Waves, February 24, 2016: "Let me now turn to the science of LIGO. This is what 
excites us the most! General relativity tells us that space-time is warped, that gravity is 
geometric, and that black holes exist. These are complex concepts, deriving from a
mathematically intricate but elegant theory. It also predicts the existence of gravitational waves. 
(...) It is worth pointing out that even though Einstein derived the existence of gravitational waves 
as a natural consequence of general relativity in 1916, he himself doubted they would ever be 
detected because the waves are so incredibly small." David Reitze misled the Committee: 
Einstein did not doubt "they would ever be detected because the waves are so incredibly small". 
Rather, Einstein doubted gravitational waves existed at all: http://arxiv.org/abs/1602.04674 
"Around 1936, Einstein wrote to his close friend Max Born telling him that, together with Nathan 
Rosen, he had arrived at the interesting result that gravitational waves did not exist, though they 
had been assumed a certainty to the first approximation. He finally had found a mistake in his 
1936 paper with Rosen and believed that gravitational waves do exist. However, in 1938, 
Einstein again obtained the result that there could be no gravitational waves!" 
https://www.quantamagazine.org/20160218-gravitational-waves-kennefick-interview/ ""There are
no gravitational waves ..." ... "Plane gravitational waves, traveling along the positive X-axis, can
therefore be found ..."..." ... gravitational waves do not exist ..." ..."Do gravitational waves
exist?" ... "It turns out that rigorous solutions exist ..."" These are the words of Albert Einstein.
For 20 years he equivocated about gravitational waves, unsure whether these undulations in the
fabric of space and time were predicted or ruled out by his revolutionary 1915 theory of general
relativity. For all the theory's conceptual elegance -- it revealed gravity to be the effect of curves
in "space-time" -- its mathematics was enormously complex." Pentcho Valev


brian banton » 2016-02-25 04:51 PM
We can detect light waves travelling through the cosmic background radiation (CMB) 
because we possess a very efficient organ which has evolved to do so. We have no such 
organ to detect gravitational waves unfortunately, but that is not a sensible reason to deny 
their existence. However, "Ripples in the fabric of spacetime" is utter nonsense. Space is 
space and time is time and they do not mix, no matter how much the mathematicians would 
like them to. Einstein was wrong about relativity because he was not aware of the CMB and 
its ability to provide a universal reference frame for all light. Had he been so aware, he 
would have been much happier, relativity would not have existed and progress would have 
been much faster. 


Epic El Niño yields massive data trove


Waning warming event studied in unprecedented detail. 
Jeff Tollefson 


02 March 2016 


Reinhard Dirscherl/Ullstein Bild via Getty 


Rising ocean temperatures caused by El Niño have damaged coral in the Indian Ocean.


Floods have ravaged parts of South America. Crops are drying up in Africa. Corals are bleaching 
around the world. The epic El Niño warming event in the tropical Pacific Ocean has boosted
temperatures and affected people and ecosystems around the globe. And thanks to a combination of 
luck and determination, scientists have been better placed than ever to record its evolution. 


“When you have an event this big, you really want to squeeze as much out of it as you can,” says 
Michael McPhaden, an oceanographer at the US National Oceanic and Atmospheric Administration 
(NOAA) in Seattle, Washington, who helps to manage an array of buoys that is used to monitor El Niño
conditions. “And we were well positioned at the beginning of 2015 to watch this thing unfold.” 


Had the El Niño occurred a year earlier, as originally predicted, McPhaden’s team would not have
been ready. In early 2014, the Tropical Atmosphere Ocean (TAO) array was on the verge of collapse.
NOAA eventually restored the array after oceanographers raised concerns that they, as well as
weather forecasters, would be deprived of crucial data if a major El Niño arrived.


El Niño and its counterpart, La Niña, which is defined by a cooling of the equatorial Pacific, have
powerful knock-on effects around the globe. This oscillation forms the basis of most seasonal weather
predictions, and scientists are mining the data for clues that will improve those predictions.


NOAA has spent roughly US$3 million to deploy aircraft, a research ship and hundreds of weather
balloons to capture as much data as possible before the El Niño fades in the next few months. And the
US National Science Foundation (NSF) has awarded 19 ‘rapid response’ research grants, totalling
$2.3 million, to researchers studying the event.


Many of the NSF projects focus on the effects of warm seawater on coral reefs; if water is too hot, corals 
bleach, expelling the colourful algae that feed them. The current bleaching event began in Guam in 
2014 and has since spread to the Atlantic and Indian oceans as El Niño has warmed the seas (see
‘Selected El Niño impacts’). More than 60% of the world’s corals could be affected in the next few
months, and with NOAA predicting that bleaching will continue into 2017, that number could rise. 


Selected El Niño impacts
The strong El Niño warming event that began last year has affected conditions around the globe.

Region | Impact
Eastern Africa | The failure of spring rains in 2015 affected crops in central Ethiopia and Sudan. Areas further south have been hit by heavy rains and flooding.
South America | Paraguay was hit by torrential rains this December and January, causing flooding and the evacuation of more than 100,000 people.
Australia | El Niño has been linked to warm and dry conditions in 2015.
India | Much of India experienced below-normal rainfall from June to September. There was above-normal rainfall in southern India and Sri Lanka this winter.
Oceania | Coral reefs in the central Pacific Ocean are on the front line of a global bleaching event that has also affected reefs in the Atlantic and Indian oceans.

Source: WMO


Scientists and policymakers often focus on the long-term risk posed by carbon dioxide emissions and 
ocean acidification. The current bleaching event — already the longest on record — has shone a 
spotlight on the immediate danger posed by rising ocean temperatures. 


“Ocean acidification may be much less of a problem than we feared, but that’s only because many of the
corals will be dead before we get there,” says Mark Eakin, who heads NOAA’s Coral Reef Watch.


In other cases, scientists were lucky to be able to watch El Niño unfold. Daniel Rudnick, an
oceanographer at the University of California, San Diego, and his colleagues received grants from the 
NSF and NOAA in 2012 and 2013 to deploy a trio of underwater gliders in the tropical Pacific from 
2013-16. The team also released 41 new Argo floats — roughly double the previous number — along 
the Equator to collect temperature and salinity data down to 2,000 metres. 


In November 2013, the gliders began to collect high-resolution measurements of the subsurface ocean
current’s eastward flow. The monitoring will continue this year.


“It’s been shown that the undercurrent strengthens during El Niño, but what we have, I think, is a more
finely resolved picture of its size,” Rudnick says. “It was serendipity with a capital ‘S’.”


Rudnick presented preliminary data on 23 February at an American Geophysical Union conference in 
New Orleans, Louisiana. The observations document the undercurrent’s evolution in 2014, when the
forecasted El Niño fizzled, and in 2015, when it roared back. By comparing the oceanic and atmospheric
conditions in both years, scientists hope to gain insights that will improve future weather forecasts. 


The focus is often on the trade winds, which usually blow to the west across the Equator. Alexey 
Fedorov, an oceanographer at Yale University in New Haven, Connecticut, says that 2014 and 2015 
both saw significant bursts of eastward winds in June, which began to push warm surface waters 
towards South America. In 2014, a burst of westward winds in August, driven by atmospheric patterns in
the Southern Hemisphere, cut that process short, but helped to set the stage for the massive El Niño
in 2015.


Fedorov says that there is no evidence that those winds could have been forecast, even by the most 
advanced climate models. “Sometimes we are right, sometimes we are wrong,” he says. “There was no 
chance to predict this.” 

Although temperatures in the tropical Pacific are near their peak, McPhaden says that El Niño’s energy
is quickly dissipating below the surface. “It’s very clear that this El Niño is losing its steam,” he says. And
that poses another crucial question for oceanographers and climate scientists: whether this El Niño will
transition into a major La Niña, as happened after the last big El Niño in 1997-98.


CAN FRACKING POWER EUROPE?


Several countries hope to unleash vast natural-gas reserves through fracking, 
but drilling attempts have been disappointing. 


Large petroleum pumps nodded up and
down in the background as British Prime
Minister David Cameron donned a blue
industrial jumpsuit to promote a controversial 
drilling technique known as hydraulic fractur- 
ing, or fracking. In his 2014 visit to a potential 
drill site in eastern England, Cameron laid out 
the benefits of tapping Britain's shale formations 
to release valuable natural gas. “We're going all 
out for shale,” he said. “It will mean more jobs 
and opportunities for people, and economic 
security for our country.” 

Cameron hopes to replicate the surge in 
natural-gas production that has happened in 
the United States thanks to fracking — which 
involves injecting fluids into shale to liber- 
ate locked-up hydrocarbon deposits. The 
fracking revolution helped to revitalize the 
US economy, and Cameron's Conservative 
Party seeks to spark a similar gas boom in the 
United Kingdom. In August last year, his newly
elected government offered drilling licences
for shale deposits and it touted estimates that 
“investment in shale could reach £33 billion 
[US$46 billion] and support 64,000 jobs”. 
Over the past few years, fracking fever has 
swept through several European nations, 
including Denmark, Lithuania, Romania and 
especially Poland, which has seen more shale 
exploration than any other nation on the con- 
tinent. Fracking might help to boost gas pro- 
duction in Europe at a time when it is facing 
a sharp decline. Older gas fields in the North 
Sea are running out, as are deposits in Ger- 
many, Italy and Romania. The disappointing 
output has increased Europe’s dependence on 
imported gas, mainly from Russia. European 
leaders have grown wary of relying on that 
source, especially after diplomatic relations 
chilled when Russia invaded Ukraine in 2014. 


© 2016 Macmillan Publishers Limited. All rights reserved 


But Europe’s appetite for gas could increase as it
tries to cut greenhouse-gas emissions — which 
will probably require reducing coal consump- 
tion (see ‘Looming gas crunch?’). The European 
Commission says that “gas will be critical for 
the transformation of the energy system”. 

This means that countries such as the United 
Kingdom have invested an immense amount 
of hope in shale gas. But a close examination of 
the industry suggests that any fracking boom 
in Europe is a long way off — and some experts 
say that it may never arrive. 

Despite several years of exploratory drilling, 
there are currently no commercial shale-gas 
wells in Europe. Tests of the region's shale poten- 
tial have been limited, and the results so far have 
been generally disappointing, say geologists and 
energy experts. It remains highly uncertain how 
much gas would be recoverable with today’s 
technologies, and even more difficult to fore- 
cast how much would be profitable to extract. 


All that leads to big questions about
Europe's shale hopes, says Jonathan Stern,
a natural-gas expert at the Oxford Institute
for Energy Studies in Oxford, UK. “There has
been an enormous amount of ridiculous hype
about shale gas in Europe.”


Fracking attempts in Poland have not led to commercial wells.
BARTEK SADOWSKI/BLOOMBERG/GETTY


WAITING FOR A REVOLUTION 

A decade ago, the United States was facing a 
similarly dismal outlook for natural gas. Pro- 
duction from conventional fields was petering 
out, and geologists did not expect that alterna- 
tive sources of gas could compensate for the 
shortfall. But within a few years, the picture 
suddenly brightened owing to improved drill- 
ing and fracking technologies, which tapped 
previously inaccessible gas reserves and 
unleashed a boom dubbed the shale revolu- 
tion. Shale is almost impermeable to oil and 
gas, so companies must fracture the rock to
liberate those hydrocarbons. 

The idea that a similar wealth of untapped 
energy could be lurking in the rocks below 
Europe is economically appealing. But geolo- 
gists know relatively little about the potential 
of shale-rock formations in Europe because 
there has been less onshore drilling than in 
the United States. European companies have 
sometimes drilled through shale to reach other 
rock formations, but they have rarely taken 
detailed measurements or collected samples 
of the shale layers. 

So far, Poland’s shale formations have 
attracted the most attention within the region. 
The nation depends heavily on coal, and what 
natural gas it does use comes almost exclusively 
from Russia. In the mid-2000s, the burgeon- 
ing US shale boom prompted Poland’s gov- 
ernment to offer shale exploration licences 
that went to local companies as well as major 
international energy firms, including the US 
companies ExxonMobil and Chevron, and the 
French firm Total. Poland’s foreign minister, 
Radosław Sikorski, said in 2010 that Poland
would become “a second Norway” — refer- 
ring to Europe’s second-largest natural-gas 
producer, after Russia. 

The excitement was bolstered in 2011 by an 
assessment from Advanced Resources Interna- 
tional (ARI), a consultancy in Washington DC 
that was commissioned by the US Department 
of Energy to study shale-gas resources world- 
wide. That study estimated the quantity of shale 
rock and other parameters such as the total 
organic content of the rock, which is the source 
of oil and gas. ARI also estimated parameters 
to represent the risk that some shale zones, or 
plays, might not prove promising or that only a 
portion of them might be amenable to drilling. 
Given these assumptions, ARI calculated that 
Poland’s shale-gas plays hold about 5,295 bil- 
lion cubic metres (bcm) of technically recov- 
erable gas, the most shale gas of any nation in 
Europe. If all of that gas could be extracted, it
would be equivalent to 325 years of Poland’s
current gas consumption¹.

While companies began drilling dozens 
of test wells in Poland, the Polish Geological 
Institute (PGI) in Warsaw made its own esti- 
mate in March 2012. Taking the considerable 
uncertainty over the data into account, the 
PGI calculated that Poland has 346-768 bcm
of recoverable shale gas onshore, about
one-tenth of ARI’s figure².

Then in July 2012, the US Geological Survey
(USGS) released another study of Poland's
shale-gas resources. The agency assumed that
individual wells would yield about half as much
gas as the PGI assumed and that the area that is
likely to contain recoverable gas is only about
one-third of the size. So the USGS wound up
with an estimate even smaller than the other
two, with a mean result of just 38 bcm of
recoverable gas, and a huge range of uncertainty,
from 0 to 116 bcm. The mean was about
one-tenth that of the PGI’s estimate, and about
one-hundredth of ARI’s³.
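The gaps between the three Polish estimates can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; two things are assumptions made for illustration and do not come from the studies themselves: the midpoint of the PGI range stands in for a single PGI value, and Poland's annual gas consumption is back-calculated from the "325 years" figure.

```python
# Figures quoted in the article, in billion cubic metres (bcm).
ARI_BCM = 5295.0                 # ARI assessment, 2011
PGI_RANGE_BCM = (346.0, 768.0)   # PGI estimate, March 2012
USGS_MEAN_BCM = 38.0             # USGS mean estimate, July 2012
ARI_YEARS_OF_SUPPLY = 325.0      # "325 years of Poland's current gas consumption"


def midpoint(low_high):
    """Midpoint of a (low, high) range. Using it for the PGI is an assumption."""
    low, high = low_high
    return (low + high) / 2.0


def implied_annual_consumption(resource_bcm, years):
    """Annual consumption back-calculated from a resource and how long it would last."""
    return resource_bcm / years


def years_of_supply(resource_bcm, annual_consumption_bcm):
    """How long a resource would last at a fixed annual consumption."""
    return resource_bcm / annual_consumption_bcm


pgi_mid = midpoint(PGI_RANGE_BCM)
consumption = implied_annual_consumption(ARI_BCM, ARI_YEARS_OF_SUPPLY)

# Ratios between the estimates, to compare with "one-tenth" and "one-hundredth".
ratios = {
    "PGI midpoint / ARI": pgi_mid / ARI_BCM,
    "USGS mean / PGI midpoint": USGS_MEAN_BCM / pgi_mid,
    "USGS mean / ARI": USGS_MEAN_BCM / ARI_BCM,
}

# Years of supply under the back-calculated (assumed) consumption.
supply_years = {
    "PGI low": years_of_supply(PGI_RANGE_BCM[0], consumption),
    "PGI high": years_of_supply(PGI_RANGE_BCM[1], consumption),
    "USGS mean": years_of_supply(USGS_MEAN_BCM, consumption),
}

if __name__ == "__main__":
    for label, value in ratios.items():
        print(f"{label}: {value:.4f}")
    for label, value in supply_years.items():
        print(f"{label}: {value:.1f} years")
```

Under these assumptions the ratios land close to the rounded fractions in the text, and the USGS mean would cover only a couple of years of consumption rather than centuries.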

“One report: huge potential. A year later:
nothing,” says PGI geologist Hubert Kiersnowski.
“The scale of uncertainty is so big.”

Meanwhile, results started coming in from 
test wells. Of the 72 wells drilled by the end of 
2015, 25 were successfully fracked to release 
gas. However, these wells yielded only about 
one-third to one-tenth of the flow that would 
be required to turn a profit, says petroleum 
geologist Paweł Poprawa of AGH University
of Science and Technology in Krakow, Poland, 
and formerly of the PGI. None of the wells has 
become a commercial producer. 

At the peak of interest in early 2013, com- 
panies held shale-drilling licences covering 
about one-third of Poland. But throughout 
2013 and 2014, the major international energy 
firms gave up their shale-exploration licences 
and left the country, often citing disappoint- 
ing results. The last to leave was Texas-based 
ConocoPhillips in June 2015 — now Poland’s 
shale drilling is almost at a standstill. 

One major hurdle to development is that 
Poland’s shale is expensive to drill because it is
buried around 3-5 kilometres down, compared
with around 1-2 kilometres for most success- 
ful US plays. Some of Poland’s shale also has a 
high clay content, which makes the rock harder 
to fracture. And exploratory holes into one of 
Poland's most promising shale formations — in 
the north, near the Baltic Sea — showed that it 
held a geological barrier that would limit how 
much gas could be tapped by individual wells, 
says Poprawa. The drilling results suggest that 
ARI “overestimated the acreage, the thickness, 
and the quality of the shale”, he says.

The PGI says that its previous lower esti- 
mates are reinforced by its latest, as-yet- 
unpublished assessment, which draws on recent 
shale-drilling tests. PGI spokesperson Andrzej 
Rudnicki calls ARI’s much higher estimates 
“enthusiastic, but geologically unrealistic”.

“The results in Poland to date indeed have 
been disappointing,” concedes geologist Scott 
Stevens of ARI. He says that the main reason 
for the unproductive wells was “extremely 
high” stresses in the rock, which makes frack- 
ing less effective. “There was no way that the 
exploration companies could know that in 
advance,” he notes. Nonetheless, he argues, “It
is too soon to dismiss Poland’s extensive shale 
potential.” Given the limited available data, he 
does not see a reason to revise ARI’s estimate. 

Even the PGI’s lower estimates suggest
that there is still a substantial amount of gas
trapped in Poland’s shale. However, it is uncertain
whether any of that gas will be profitable to
extract. “I am still hopeful,” Poprawa says. “But
the initial hopes were not realistic.”


DASH FOR GAS 

Although companies raced to grab concessions 
in Poland, activity in the United Kingdom has 
been subdued. In 2011, Cuadrilla Resources 
fracked the United Kingdom’s first shale well
near Blackpool in northern England, but this 
triggered two small earthquakes, which led the 
government to place a year-long moratorium on 
further fracking. After the moratorium lifted, 
companies slowly began vying to tap UK shale. 

According to a 2013 assessment by ARI, UK 
shale holds 17,600 bcm of gas. Only 728 bcm of 
this is judged to be technically recoverable: if that 
could be profitably extracted, it would satisfy the 
United Kingdom’s gas needs for about a decade⁴.
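The ARI figures for the United Kingdom imply two quantities that the text does not state directly: the share of gas in place that is judged technically recoverable, and the annual demand consistent with "about a decade" of supply. The sketch below works them out; treating "about a decade" as exactly 10 years is an assumption made here for illustration, so the demand figure is only indicative.

```python
# Figures quoted in the article for ARI's 2013 UK assessment (bcm).
GAS_IN_PLACE_BCM = 17600.0
TECHNICALLY_RECOVERABLE_BCM = 728.0
YEARS_OF_SUPPLY = 10.0  # assumption: "about a decade" taken as exactly 10 years


def recoverable_fraction(recoverable_bcm, in_place_bcm):
    """Share of the gas in place that is judged technically recoverable."""
    return recoverable_bcm / in_place_bcm


def implied_annual_demand(recoverable_bcm, years):
    """Annual demand consistent with the resource lasting the given number of years."""
    return recoverable_bcm / years


fraction = recoverable_fraction(TECHNICALLY_RECOVERABLE_BCM, GAS_IN_PLACE_BCM)
demand = implied_annual_demand(TECHNICALLY_RECOVERABLE_BCM, YEARS_OF_SUPPLY)

if __name__ == "__main__":
    print(f"Technically recoverable share: {fraction:.1%}")
    print(f"Implied annual demand: {demand:.1f} bcm per year")
```

On these numbers only about 4% of the gas in place counts as technically recoverable, which is why the headline resource figure and the usable supply differ so sharply.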

The British Geological Survey (BGS) has 
assessed the shale-gas resources in the United 
Kingdom’s three major plays by construct- 
ing a 3D model of the subsurface using drill- 
ing records and seismic surveys, which has 
allowed it to roughly estimate the volume of 
shale rock. But geologist Ian Andrews of the 
BGS insists that this estimate is just a first pass 
based on the seismic information available, 
“which is sparse, and fairly poor”. 

By testing old rock cores stored by the government,
the BGS was also able to measure
some of the properties of UK shale, such as
the total organic carbon (TOC) content. Successful
shale plays in the United States typically
have TOC values greater than 2%. Although
TOC measurements for the United Kingdom
are scant, the available data suggest that there
are large volumes of rock above the 2% threshold.
But data are lacking for other key parameters,
such as the rock porosity, which adds
greatly to the uncertainty of these projections.


LOOMING GAS CRUNCH?
Domestic production of natural gas has fallen in Europe (except Russia) and forecasts predict further declines.
That could force nations to increase their imports, depending on consumption trends. Fracking could unlock
additional supplies, but drilling results so far have not been promising.

The BGS estimated that the three shale plays 
it has assessed so far hold around 39,900 bcm
of gas, with an uncertainty range of 24,700- 
68,400 bcm (refs 5,6). This is more than the 
ARI estimate, but that study only considered 
the most promising rock. The BGS did not 
attempt to estimate how much of that gas 
would be technically recoverable. “How much 
we can get out of the ground, I don’t think 
anybody knows yet, because the drilling hasn’t 
happened to test it,” says Andrews. 

Although the BGS’s studies used US shale 
plays as analogues for crucial parameters, the 
two nations have different geological histories. 
The United States has large deposits of shale 
that are not too thick and have been folded 
little over time. The shale in the United King- 
dom is more complicated, says petroleum 
geoscientist Andrew Aplin of the University of 
Durham, UK. “It’s been screwed around with 
more”, creating more folds and faults.

That greater complexity could pose chal- 
lenges. One risk is that pumping fluid into rock 
can trigger earthquakes if the wells are near 
faults or large natural fractures. “It’s better to
stay away from them, especially when they're
located near densely populated areas,” says
natural-gas expert René Peters of the Netherlands
Organisation for Applied Scientific
Research (TNO) in The Hague. But there has
been relatively little high-resolution seismic 
imaging in Europe, he says, so “not all these 
fractures are known”. Small faults can pose 
another challenge. If the fracking fluid leaks 
into a fault, the pressure on the rock is reduced
and the fracking is less effective. Given the
geological hurdles and the United Kingdom's 
dense population, it may prove difficult to find 
many promising, acceptable places to drill. 


FAR FROM PROFITABILITY 
The United Kingdom’s appetite for gas is 
expected to grow sharply. In November, the 
government set out the goal of phasing out 
coal-fired power plants by 2025, unless they 
have carbon capture and storage systems. The 
government expects nuclear, wind and solar 
power to play a part in filling the void left by 
coal — but natural gas would be the linchpin 
because it produces less carbon dioxide and 
other pollution than does coal, and existing 
infrastructure can be used to produce electric- 
ity from gas. “We'll only proceed if we're con- 
fident that the shift to new gas can be achieved 
within these timescales,” UK energy secretary
Amber Rudd said in a speech announcing the 
policy shift. “We currently import around half 
of our gas needs, but by 2030 that could be as 
high as 75%. That’s why we're encouraging 
investment in our shale-gas exploration so we 
can add new sources of home-grown supply.”

Other European nations are also counting 
on natural gas to help them to cut their coal 
use and meet their commitments under the 
United Nations climate treaty signed in Paris 
in December. But shale gas may not provide 
the answer. At the June 2015 World Gas Con- 
ference in Paris, industry speakers were pessi- 
mistic that Europe would see a fracking boom 
like that in the United States. Philippe Charlez, 
manager of unconventional resources develop- 
ment at Total, said that given the current costs 
for shale wells, “we are very, very far in Europe 
from profitability”. 

Many assessments in the past two years,
including those by the International Energy
Agency and oil giants BP and ExxonMobil,
agree that Europe is unlikely to produce
much shale gas, and that conventional gas
production will continue to decline’°. And 
if gas imports cannot make up the difference, 
says Stern, “Europe is going to have even more 
difficulty reducing carbon emissions”. 

The most recent signs are not good for shale 
across the continent. Besides retreating from 
Poland, major petroleum companies have 
pulled out of nascent shale drilling efforts in 
Romania, Lithuania and Denmark, usually cit- 
ing disappointing yields. Various members of 
the European Union from Bulgaria to France 
have instituted moratoria or bans on fracking, 
as have Scotland, Wales and Northern Ireland, 
all citing environmental concerns. 

England is home to some of the few remain- 
ing attempts to tap shale gas in Europe. A 
handful of companies have applied for permis- 
sion to drill, which could finally reveal whether 
the United Kingdom’s shale deposits will be a 
jackpot or a dud. But environmentalists have 
put up a strong fight, and permissions have 
been slow to emerge. 

Cuadrilla requested approval in January 
2015 to drill beneath the undulating fields of 
Lancashire, but the county council rejected 
the request in June over concerns about traf- 
fic, noise and the visual impact of drilling. That 
decision and the broader difficulties that con- 
front fracking in Europe leave the future of nat- 
ural gas there in limbo. To figure out whether 
any play has potential, companies must drill as 
many as 50 to 100 wells. But the public opposi- 
tion and the poor drilling results so far mean 
that companies are not eager to sink that kind 
of effort into fracking in Europe right now, says 
Stern. “I can’t see any country, including the 
UK, where that will happen anytime soon.”


Mason Inman is a reporter in Oakland, 
California. Travel for this article was 
supported by the European Geosciences 
Union's Science Journalism Fellowship.


Virus-sized particles that fluoresce in
every colour could revolutionize applications 
from television displays to cancer treatment. 


HE NANOSCALE 


At Biopolis, a sprawling research complex
in Singapore, Chi Ching Goh leans over 
an anaesthetized mouse lying on the 
table in front of her, and carefully injects it with a 
bright yellow solution. She then gently positions 
the mouse’s ear underneath a microscope, and 
flips a switch to bathe the ear in ultraviolet light. 
Seen through the microscope’s eyepiece, the illu- 
mination makes the blood underneath the skin 
glow green, tracing the delicate vessels that carry 
the solution through the creature's body. 
Ultimately, Goh, a PhD candidate at the 
National University of Singapore, hopes that 
the method will help her to find blood vessels 
that are leaking owing to inflammation, perhaps 
helping to detect malaria or predict strokes. 
Crucial to her technique are the virus-sized 
particles that give the solution its colour. Just a 
few tens of nanometres across, they are among
a growing array of ‘nanolights’ that researchers
are tailoring to specific types of fluorescence: the 
ability to absorb light at one wavelength and re- 
emit it at another. 

Many naturally occurring compounds can do 
this, from jellyfish proteins to some rare-earth 
compounds. But nanolights tend to be much 
more stable, versatile and easier to prepare — 
which makes them attractive for users in both 
industry and academia. 

The best-established examples are quan- 
tum dots: tiny flecks of semiconductor that are 
prized for their beautiful, crisp colours. Now, 
however, other types of nanolight are on the 
rise. Some have a rare ability to absorb lots of 
low-energy photons and combine the energy 
into a handful of high-energy photons — a trick 
that opens up opportunities such as producing 
multiple colours at once. Others are made from
polymers or small organic molecules. These are
less toxic than quantum dots and often outshine 
them — much to the amazement of chemists, 
who are used to carbon-based compounds sim- 
ply degrading in the presence of ultraviolet light. 

“I was kind of surprised to find that we can 
make organic particles much brighter than inor- 
ganic particles,” says Bin Liu, a chemical engi- 
neer at the National University of Singapore and 
the designer of the fluorescent nanoparticles 
that Goh is using. 

Nanolights have already begun to find
application in areas ranging from flat-screen
displays to biochemical tests. And researchers
are working towards even more ambitious
uses in fields such as solar energy, DNA mapping,
motion sensing and even surgery. “The
research is certainly fast-paced,” says Daniel
Chiu, who studies fluorescent nanoparticles
at the University of Washington in Seattle.

It is also increasingly wide ranging, adds Paul
Alivisatos, a chemist at the University of California,
Berkeley, and a co-founder of the first
quantum-dot technology companies. “It’s so
much fun now.”


Solutions of quantum dots made from cadmium
selenide respond to ultraviolet light by emitting
visible light in specific wavelengths.


SIZE MATTERS 
The nanolight era began with the discovery of 
quantum dots in 1981. Russian physicists were 
growing tiny crystals of the semiconductor 
cuprous chloride in silicate glass and observed 
that the colour of the glass depended on the 
size of the particles’. The crystals were so small 
that quantum effects were kicking in and they 
were behaving somewhat like atoms: they could 
absorb or emit light only as specific colours, 
with the exact frequencies depending on the size 
or shape of the particles (see ‘Bridge the gap’). 
The quantum dots were bright and beauti- 
ful, says Yin Thai Chan, who studies them at 
the National University of Singapore, but “there 
were no obvious applications”. By the early 
2000s, however, the pure colours had begun to 
attract television manufacturers, as well as bio- 
medical researchers, who saw their potential for 
labelling specific proteins and DNA segments. 
“Everything is good about quantum dots,” 
says Liu — except for one thing: their toxicity. 
The best-performing dots contain cadmium, 
which can poison cells. This limits their use- 
fulness in biology and in applications such as 
household electronics, because some countries 
do not allow use of the element in such devices. 
To some extent, this problem can be overcome 
by replacing cadmium with zinc or indium, 
which are considerably less toxic, or by wrap- 
ping cadmium-based quantum dots in poly- 
mers that are biocompatible. But the toxicity is 
still a drawback for researchers who are pursuing
ambitious applications such as fluorescence-guided
surgery, in which nanoparticles are
injected into a tumour, for instance, to make it
glow and help surgeons to remove all traces of it.


GOING ORGANIC 

Partly in response to this challenge, research- 
ers have begun to develop nanoparticles from 
materials that fluoresce naturally. Because the 
light-emitting properties of these nanolights 
come from their composition rather than their 
size or shape, they are easier to make with spe- 
cific colours. “Practically, this is useful because 
of the difficulties to synthesize everything in the 
same size,” says Chiu.

It also frees up nanolight researchers to 
explore alternative materials, such as semi- 
conducting polymers. Studied for their 
potential in electronics since the 1950s, 
these polymers consist of simple compounds 
linked into a long chain in which electrons 
are free to move, but only at certain energies
determined by the chain’s composition.

Light is emitted when electrons are kicked 
up to higher energy levels by some outside 
source, such as ultraviolet light, then fall back 
down to lower levels. The polymers can also be 
decorated with side groups to give them spe- 
cific properties — for example, targeting them 
to cancer cells, or helping them to dissolve in 
water. And when chains are aggregated into 
polymer nanoparticles, or ‘P-dots’, they can be
as much as 30 times brighter than a quantum 
dot of comparable size’. 

Semiconducting polymers do tend to be less 
stable than the inorganic semiconductors used 
in quantum dots. But because they are based on 
carbon, and contain no metals, they are much 
more likely to be biocompatible. P-dots have 
been used to stain and image cells, and also as 
sensors to detect oxygen, enzymes or metal ions 
such as copper. 

In 2013, for example, Chiu and his collaborators
reported that a P-dot bound to a terbium
ion can detect biomolecules produced by
bacterial spores*. Under an ultraviolet lamp, the
P-dots glow dark blue and the terbium ions
emit a faint neon green colour. But when passing
biomolecules attach themselves to terbium,
the ions’ light strengthens to a bright green. The
P-dots’ light remains unchanged, so it serves as
an internal standard.

Unfortunately, P-dots also have a fundamental
problem: the polymer molecules are packed
together so closely that they can be affected by
‘quenching’, a phenomenon in which most
of the energy coming from the original light
source is quickly dissipated and fails to trigger
fluorescence.

Quenching has a huge impact on efficiency,
says Yang-Hsiang Chan, a chemist at National
Sun Yat-Sen University in Kaohsiung, Taiwan.
One way to tackle it is to add bulky groups onto
the polymer backbone to prevent the polymers
from getting too close to each other. But this
can be self-defeating: the resulting nanoparticles
tend to be too fat to get into cells, say, or too
dim to be useful. “It is very hard to get the right
balance,” says Chan, who is working to solve the
problem by designing new polymers.


TOGETHER WE SHINE

A more fundamental solution was pioneered in
2001, when Ben Zhong Tang at the Hong Kong
University of Science and Technology in Clear
Water Bay found that a class of small organic
molecules would fluoresce only when they
aggregate together*. These molecules are shaped
like propellers or pinwheels, and they fluoresce
when packed because they can no longer move
and waste their energy. Instead, they release
their energy as light, a phenomenon Tang has
named aggregation-induced emission (AIE).
He called the molecules AIE-gens.

Over the next few years, Tang and his
students changed the side groups and introduced
elements such as nitrogen or oxygen, and
AIE-gens can now glow in the entire spectrum
of colours from ultraviolet to near-infrared. “My
students quickly made a lot,” says Tang. “We can
change the colour at will.”

In 2011, Tang met Liu through a collaboration
at the Institute of Materials Research and
Engineering in Singapore, part of the
government-backed Agency for Science, Technology
and Research (A*STAR). At that time, AIE-gens
were performing well, except that they could not
dissolve in water, which made them difficult to
use in biological applications. Liu was an expert
in making things water-soluble, so Tang gave
her some of his best AIE-gens to work with.

Liu solved the problem by experimenting
with polymers that are oil-loving on one end
and water-loving on the other. The AIE-gens
crowd within the polymer’s oil-loving ends,
and its water-loving ends point outwards to
form a protective shell, resulting in a
water-soluble capsule with a dense core full of
AIE-gens. Liu designed a protective shell for the
resulting nanoparticles, called AIE-dots, such
that it could be decorated with various chemical
groups that are tailored to specific applications.
The shell can easily accommodate a wide variety
of AIE-gens, says Liu, “so that we can screen a
lot of molecules very quickly to find out which
one is the best.”

AIE-dots have been used to stain various 
tissues, from blood vessels to cancer cells to 
intracellular organelles such as mitochon- 
dria. Last year, Liu, Tang and their colleagues 
reported an AIE-dot that could be useful in a 
type of light-activated treatment known as
photodynamic therapy. It carries two molecules on
its surface: one to get the dot into a cancer cell, 
and another to make it stick to the mitochon- 
dria. Once excited by an external light source, 
the AIE-dot produces red light that generates 
oxygen radicals near the mitochondria and kills 
the cancer cells. 

The best AIE-dots can be 40 times brighter 
than quantum dots. “With AIE, high density
in constrained space produces high brightness,” 
says Guangxue Feng, a research assistant in Liu’s 
lab. That is particularly useful for applications 
such as visualization of tissues or long-term 
tracking of cancer cells, which halve the number 
of nanoparticles per cell every time they divide. 
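
To see why brightness matters for long-term tracking, the rough sketch below assumes that each division splits a cell's nanoparticles evenly between its daughters, and that a cell stays visible while its total signal exceeds a fixed threshold. The starting load, the threshold and the function names are hypothetical, chosen only for illustration; they are not figures from the studies described here.

```python
import math


def particles_per_cell(initial_load: float, divisions: int) -> float:
    """Particles left per cell after some divisions (even split assumed)."""
    return initial_load / (2 ** divisions)


def trackable_divisions(initial_load: float, brightness: float, threshold: float) -> int:
    """Number of divisions before signal (load x brightness) falls below threshold."""
    signal = initial_load * brightness
    if signal < threshold:
        return 0
    # Each division halves the signal, so solve signal / 2**n >= threshold for n.
    return math.floor(math.log2(signal / threshold))


# Hypothetical numbers: 1,000 particles per cell, detection threshold of 50 units.
baseline = trackable_divisions(initial_load=1000, brightness=1.0, threshold=50)
brighter = trackable_divisions(initial_load=1000, brightness=40.0, threshold=50)
print(baseline, brighter)
```

Under these assumptions, a 40-fold gain in per-particle brightness buys about five extra generations of tracking, because log2(40) is roughly 5.3.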

But the brightness comes at a cost: AIE-dots
produce a much broader, more-muted spectrum
than the pure, brilliant colours of quantum
dots. Even so, that hasn't kept Liu from starting
LuminiCell, a spin-off company in Singapore 
that produces AIE-dots in three colours and 
three sizes for research such as Goh’s at A*STAR. 
Tang is also trying to start a company;
both he and Liu are now hoping to 
gain approval from the US Food and 
Drug Administration to test AIE-dots 
for human use in applications such as 
fluorescence-guided surgery. 


INTO THE INFRARED 
Another thing that limits the biologi- 
cal use of nanolights is that most of 
them absorb ultraviolet or visible 
light, which can penetrate only a 
few millimetres into tissue. Longer- 
wavelength near-infrared radiation 
can penetrate up to three centimetres,
a much better depth for uses such
as releasing drugs. But infrared light 
does not have enough energy to break 
the bonds that hold drugs on the 
nanoparticle, so many researchers are 
turning to a process called upconver- 
sion. This involves making material 
that can absorb multiple low-energy 
infrared photons, accumulate the 
energy and then re-emit it as higher- 
energy ultraviolet or visible photons. 
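
The energy bookkeeping behind upconversion can be sketched with the standard relation between a photon's energy and its wavelength. The wavelengths used below (980 nm absorbed, 490 nm and 365 nm emitted) are illustrative assumptions rather than values from the work described here, and the calculation ignores all losses, so it gives only the minimum number of photons required.

```python
import math

# h*c expressed in eV·nm, so photon energy (eV) = 1239.84 / wavelength (nm).
PLANCK_EV_NM = 1239.84


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of a single photon of the given wavelength, in electronvolts."""
    return PLANCK_EV_NM / wavelength_nm


def min_photons_needed(absorbed_nm: float, emitted_nm: float) -> int:
    """Fewest absorbed photons whose combined energy covers one emitted photon."""
    ratio = photon_energy_ev(emitted_nm) / photon_energy_ev(absorbed_nm)
    # Small tolerance guards against floating-point noise when the ratio is a whole number.
    return math.ceil(ratio - 1e-9)


# Assumed example wavelengths: near-infrared in, visible or ultraviolet out.
print(photon_energy_ev(980))
print(min_photons_needed(980, 490))
print(min_photons_needed(980, 365))
```

Shorter emitted wavelengths need more absorbed photons per emitted photon, which is one reason that real conversion efficiencies are low.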
The group of heavy-metal ele- 
ments known as lanthanides are par- 
ticularly good at this trick. In 2011, 
Xiaogang Liu at the National Uni- 
versity of Singapore reported that his 
laboratory had created a particularly 
versatile type of nanoparticle with
a Russian doll-like structure. It con- 
sists of a series of concentric shells 
that each contains a different com- 
bination of lanthanides. The energy 
from infrared light is absorbed by the 
core, then migrates outwards layer by 
layer, snowballing from lanthanide to 
lanthanide before finally emerging as 
high-energy light near the surface. 
The 15 lanthanides can be com- 
bined in numerous different ways 
to produce nanoparticles that emit 
in all colours, sometimes even sev- 
eral at once. In one demonstration, 
a student in Liu’s lab shone an infra- 
red laser through a series of beak- 
ers containing clear solutions of the 
nanoparticles: glowing lines of purple 
and green light appeared in the beak- 
ers where the infrared beam passed 
through. 


Liu thinks that these upconversion nano- 
particles have tremendous potential in photo- 
voltaics, where they could help to capture 
near-infrared light, which makes up almost half 
of the Sun’s radiation. This is a long way from 
being practical, however: the brightest avail- 
able nanoparticles convert just 10% of the light 
they absorb. Liu’s group is working to build a 
library of these nanoparticles — no small task 
considering the number of lanthanides — to 
systematically study their properties and work
on making them brighter.
BRIDGE THE GAP

Some virus-sized particles can be tailored to absorb light and fluoresce at
specific frequencies. Light kicks electrons up to higher energy levels, and
the glow is emitted when the electrons fall back past a wide energy gap.

QUANTUM DOTS

These particles contain layers of crystalline semiconductor material.
The colour they emit depends on the particle's size.

P-DOTS

These particles are made of semiconducting polymers that emit light when
they aggregate. They can be brighter than a quantum dot of the same size,
but the glow fades if the strands bunch too tightly.

UPCONVERSION PARTICLES

Particles made from layers of heavy-metal lanthanide elements can
accumulate energy from infrared light and re-emit it as visible or
ultraviolet light.

Last December, Marta Cerruti, a biomaterials
scientist at McGill University in Montreal,


which a lanthanide-containing nanoparticle is 
coated with a gel that contains a ‘drug’ — for 
testing purposes, a compact, stable protein.
After absorbing near-infrared light, the nanoparticle
emits infrared, visible and ultraviolet
light simultaneously. The infrared emission 
allows the researchers to track the nanoparti- 
cle’s location, and the ultraviolet light cleaves 
the protein's bond to the gel and releases it — or 
at least, it has in the laboratory. Cerruti’s group 
is now planning tests in animals.

At the end of the day, quantum 
dots are still the nanolights to beat. 
“They are the de facto standard,” says
Chan. “A lot of the fundamental phenomena
concerning light emission
are established in quantum dots and
it shapes the way others explain what
they see.”

Quantum dots are also still a 
research frontier. For example, they 
are getting a boost from relatively 
new semiconducting materials such 
as the perovskites. Unlike conven- 
tional semiconductors, which have 
a fixed ratio of elements, perovskites 
can have variable ratios, so research- 
ers can tailor the dots’ emission by 
varying their composition as well as 
their size. “They have two degrees of 
freedom for tunability,” says Edward
Sargent, a materials engineer at the 
University of Toronto, Canada. 

Last year, Sargent reported a hybrid 
material in which quantum dots are 
held within a perovskite, yielding
the kind of high brightness and good 
electron mobility that manufactur- 
ers might like for use in flat-screen 
displays. 

Other researchers are hoping to 
combine the best properties of each 
component by pursuing hybrid nano- 
lights. Bin Liu, for example, is trying 
to blend AIE-dots with quantum dots 
to produce narrow emissions. And 
semiconducting polymers paired with 
AIE-dots can produce much brighter 
particles than either alone.

Another grand challenge for nano- 
lights is to create versions that emit 
infrared wavelengths efficiently. That 
would open up applications in motion 
sensing, from tiny detectors that tell 
the screen to turn off when a mobile 
phone is lifted to the ear to sophisti- 
cated devices for self-driving cars and 
home monitoring for elderly people. 
“There's so much more we could do,” 
says Sargent.


XiaoZhi Lim is a freelance writer in 
Singapore. 
An abandoned supermarket in the Fukushima prefecture in July 2015.


Five years on from 
Fukushima 


To build sustainability and trust, energy and environment research in Japan must 
become more interdisciplinary and global, say Masahiro Sugiyama and colleagues. 


Next week will mark five years since
11 March 2011, the day of the 
devastating Tohoku earthquake 
and tsunami, and the accident that followed 
at the Fukushima Daiichi nuclear-power 
plant of the Tokyo Electric Power Com- 
pany. The quake and tsunamis killed nearly 
16,000 people and injured more than 6,000; 
2,600 are still missing. 
Fukushima prefecture, the location of the 
crippled nuclear plant, was hit particularly
hard. The Japanese government provisionally
rated the severity of the accident on par with 
the 1986 Chernobyl disaster — a seven on the 
seven-point International Nuclear and Radiological
Event Scale. Around 110,000 people
had to evacuate because of dispersed radio- 
nuclides. Despite the large-scale decon- 
tamination efforts, about 70,000 former 
residents are yet to return. 

Shocked by the fallout, Japan changed its 
energy policy. The year before the disaster, 
the 2010 Basic Energy Plan had called for
53% of electricity generation to come from 
nuclear power by 2030, implying significant 
new construction. Since the accident, Japan's 
energy policy has featured an expanded role 
for renewables and market liberalization — 
transitioning from a regional-monopoly 
model to one that is open to competition. 
(Some policy changes were made after the 
change in government in December 2012.) 
In July 2012, reflecting the public’s 
desire for a transition towards renewable
energy, Japan introduced a feed-in tariff, 
which guarantees renewable-energy genera- 
tors a high price for the electricity that they 
feed into the grid; it was particularly gener- 
ous for solar photovoltaics. Installed solar 
capacity more than quadrupled in the first 
three years. A whopping 80-gigawatt capac- 
ity — 90% greater than Japan's nuclear-power 
capacity — has been approved. Next month, 
the retail electricity market will be fully lib- 
eralized, attracting a huge number of new 
entrants to the previously closed market. 

The government also strengthened nuclear 
security substantially. A newly created, inde- 
pendent Nuclear Regulation Authority insti- 
tuted new safety measures in 2013. All of 
Japan's nuclear-power plants halted operation 
for inspection after the accident, and the share 
of nuclear in power generation dropped from 
about 30% in 2010 to zero in 2014. Only 4 of 
the 44 reactors have been restarted so far. 

The public debate on nuclear rumbles on. 
Still an important resource for this resource- 
poor nation, it is expected to provide 20-22% 
of electricity by 2030, under a new long-term 
energy strategy created in conjunction with 
Japan's pledge to the United Nations to cut 
greenhouse-gas emissions. 


WEAK LINKS 

The journey since 2011 has been difficult, 
with policy controversies on every front. 
Many of the fraught decisions — on evacua- 
tion, clean-up, energy transition and disaster 
preparedness — were at the science-policy 
interface. And scientists, especially those
involved in giving policy advice, lost credibility
and the trust of the public.

Several initiatives have been launched to 
rebuild these crucial bridges. They include an
effort by the Japan Science and Technology
Agency (JST) on research into scientific
advice, and a ‘deliberative polling’ exercise
in August 2012 that involved the public and was
used to inform the energy policy of the previous
government. These efforts are yielding
genuine progress, but slowly. 

We strongly believe that the events and 
aftermath of 11 March highlighted a fun- 
damental problem with research in Japan: 
weak connections between disciplines and 
between Japan's scholars and those working 
in other countries. In a nation that performs
world-class research in conventional disciplines,
interdisciplinary scholarship lags,
and Japanese researchers are keenly aware of
this. Moreover, the nation’s breadth of disciplinary
coverage is narrower and the rate of
international collaboration is lower than in 
comparable nations (see ‘Build bridges’). 

This has a particularly important implication
for energy and environment research,
which requires the integration of diverse
knowledge that can come from anywhere
in the world. During the Fukushima crisis, 
researchers who were not used to collaborating
with other disciplines (or other nations)
struggled to do so.


TWO CASE STUDIES 
Two examples illustrate the problem. The 
first concerns assessing the safety of nuclear- 
power plants. Probabilistic risk assessment 
(PRA) is a standard tool to quantitatively 
evaluate the likelihood of severe accidents 
and their impacts, using analysis methods 
such as fault trees and Monte Carlo simula- 
tion. Before the earthquake, nuclear experts 
conducting PRA research in Japan focused on 
internal events at nuclear plants (mechanical 
component failures and human errors), deal- 
ing mostly with engineering knowledge. 
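
A toy version of the fault-tree and Monte Carlo approach mentioned above might look like the sketch below. The tree (two redundant pumps plus a power supply), the failure probabilities and all function names are hypothetical, chosen only to show how gates combine and how sampling approximates the exact answer; real PRA models are far larger.

```python
import random


def top_event(pump_a_fails: bool, pump_b_fails: bool, power_lost: bool) -> bool:
    """Toy fault tree: cooling is lost if both pumps fail (AND) or power is lost (OR)."""
    return (pump_a_fails and pump_b_fails) or power_lost


def estimate_top_probability(p_a: float, p_b: float, p_power: float,
                             trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate: sample each basic event independently and count failures."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        hits += top_event(rng.random() < p_a, rng.random() < p_b, rng.random() < p_power)
    return hits / trials


def exact_top_probability(p_a: float, p_b: float, p_power: float) -> float:
    """Closed-form answer for the same tree, valid only for independent events."""
    return 1 - (1 - p_a * p_b) * (1 - p_power)


# Hypothetical per-demand failure probabilities.
estimate = estimate_top_probability(0.1, 0.1, 0.05, trials=200_000)
exact = exact_top_probability(0.1, 0.1, 0.05)
print(estimate, exact)
```

The sketch treats failures as independent. An external event such as an earthquake or tsunami can disable several components at once, which breaks that assumption and is one reason the analysis has to reach beyond engineering data.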
What the disaster vividly demonstrated is 
that nuclear plants are susceptible to external 
events and that accident impacts are not con- 
tained — they may include the release of radi- 
onuclides, with dire environmental effects. 
The PRA has therefore been extended beyond 
nuclear engineering to cover disciplines rang- 
ing from seismology and geology to atmos- 
pheric science and ecological modelling. 
Before 2011, such interdisciplinary PRA
research in Japan was limited compared with
other developed economies that have a significant
nuclear presence, such as the United
States, the United Kingdom and France (see
go.nature.com/sev7 Lo; in Japanese). This was
partly because the country did not require
PRA for regulatory purposes. Under the new
2013 regulations, Japan mandated the use of
PRA for nuclear plants and is now trying to
catch up in this area.

BUILD BRIDGES
The breadth of disciplinary coverage is narrower in
Japan than in nations that have similar research
systems, and international collaboration is lower.

The second example concerns innovation 
in renewable energy. Although many citizens 
would prefer the nation’s energy portfolio to 
have a larger share of renewables, these are 
more costly in Japan than elsewhere, even for 
technologies such as solar photovoltaics, in 
which Japan was a pioneer. 

Ideally, Japan should explore how each 
policy alternative might affect future cost 
trends for solar. Combining energy-systems 
analysis and policy analysis with technol- 
ogy forecasting methods (based on expert 
elicitation, bottom-up engineering analysis 
and learning curves that describe empirical 
relationships between cumulative produc- 
tion and cost reduction) would yield crucial 
insights. Such interdisciplinary studies in 
Japan are hard to come by. Because of the 
low level of global networking in these areas, 
international experiences are not widely 
appreciated in Japan either. 
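
The learning curves mentioned above are often written as a simple power law, in which unit cost falls by a fixed fraction each time cumulative production doubles. The sketch below assumes that common single-factor form; the 20% learning rate, the starting cost and the function names are hypothetical and are not estimates for Japan's solar market.

```python
import math


def learning_exponent(learning_rate: float) -> float:
    """Exponent b such that cost falls by `learning_rate` per doubling of output."""
    return -math.log2(1 - learning_rate)


def unit_cost(initial_cost: float, initial_output: float,
              cumulative_output: float, learning_rate: float) -> float:
    """Single-factor learning curve: cost = c0 * (x / x0) ** (-b)."""
    b = learning_exponent(learning_rate)
    return initial_cost * (cumulative_output / initial_output) ** (-b)


# Hypothetical case: cost index of 100 at the starting output, 20% learning rate.
for doublings in range(4):
    output = 2 ** doublings
    print(doublings, round(unit_cost(100.0, 1.0, output, 0.20), 2))
```

Comparing policy alternatives would then amount to comparing how quickly each pushes cumulative output along this curve, and what is paid for installations made before the cost has fallen.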

This has already affected the solar market 
in Japan. Under the feed-in tariff, developers 
rushed to install expensive solar devices, cost- 
ing consumers trillions of yen that could have 
been saved by gradual installations made in 
tandem with cost reductions. In Germany, by 
contrast, there was a clear incentive for solar 
developers to reduce cost under its fine-tuned 
tariff scheme with frequent price adjustments. 

Our critics will say that these issues are 
political, not academic. We feel that this atti- 
tude is the source of the problem. Engaged 
scholarship is a prerequisite for informed 
policymaking. Scientists and social scientists 
must do their part. 


TWO FIXES 

Two big changes would go a long way to 
improving interdisciplinary research in 
energy and the environment in Japan. Going 
global is the key, and will pay dividends: Japan 
would leverage international expertise, and 
the rest of the world would learn from Japan's 
experiences. 


Globalize the review process. Because of the 
small number of researchers engaged in inter- 
disciplinary research, the pool of reviewers for 
academic journals and funding proposals is 
limited. In policy-relevant interdisciplinary 
research, particularly in energy and environ- 
ment, publishers and granting programmes, 
such as the government-backed KAKENHI 
(Grants-in-Aid for Scientific Research), 
should make parts or all of their review pro- 
cesses international. The connections made 
could also boost international collaboration. 

Employees, and thousands of paper cranes, inside the Fukushima plant three years after the accident.

For strategic research in energy and the
environment, funding agencies should
require scientists to publish part of their
results in international journals even for 
policy-oriented research, whose target read- 
ership obviously prefers Japanese to English. 
Many papers, although tailored to the policy 
context of Japan, would appeal to global 
experts because energy and environmental 
issues are global. Large-scale programmes 
of the science-and-technology ministry, 
and strategic research funds of the environ- 
ment ministry, should take the lead. Because 
policymakers need deliverables to be commu- 
nicated in Japanese as well, this will increase 
the burden on researchers, which should be 
reflected in their funds. 

In April, Japan will start a new round of 
its Science and Technology Basic Plan, a 
cabinet-level, five-year policy on research 
and innovation. It is good to see that building 
an international researcher network is one of 
the key agenda items. Japan must make that 
vision into a reality. 


Globalize research. Strategic, policy-oriented 
research programmes in Japan should be 
designed so that they can benefit from inter- 
national experience and domestic experi- 
ence can be shared globally. For example, 
the Collaborative Laboratories for Advanced 
Decommissioning Science (CLADS), estab- 
lished as a research base for the decommis- 
sioning of the Fukushima plant, should be 
more outward-facing. 

Decommissioning involves many dis- 
ciplines, including nuclear engineering, 
meteorology and oceanic-risk assessments, 
ecology and remediation. By soliciting
international research proposals, CLADS
should involve more researchers from elsewhere
in Asia, where many countries have
nuclear ambitions, including China, South 
Korea, India and many southeast Asian 
countries. Working with overseas scientists, 
CLADS should publish some outcomes in 
English. 

Another opportunity is Future Earth, a ten-year
global sustainability research initiative
that puts interdisciplinarity at the forefront
alongside stakeholder engagement.
For its contribution, Japan should elevate
energy research to a key component. Japan
has several advanced energy technologies, but
to move them into the market at scale requires 
outside input, particularly when it comes to 
innovation policy. 

As Asia becomes the centre of the global 
energy economy, the time is ripe for Japan, as 
part of the Future Earth platform, to embark 
on a truly interdisciplinary and international
project, and colleagues from neighbouring
nations should do the same. Such an initiative
should receive rigorous academic oversight 
from an international advisory body. 


BETTER TOGETHER 

This year is also the 30th anniversary of the 
Chernobyl nuclear accident. In Europe, and 
Germany in particular, that disaster spawned 
fresh thinking on many fronts. The German 
book Risk Society by Ulrich Beck, published 
in 1986 soon after the accident, explored how
risks from technology and industrialization
shape modern society.

© 2016 Macmillan Publishers Limited. All rights reserved

The disaster catalysed a transition away 
from nuclear to renewables, which is now 
gathering renewed momentum, backed up 
by interdisciplinary studies on energy trans- 
formation. As in Germany in the late 1980s, 
Japan has seen many fresh attempts to carve 
out new directions for research, but so far 
such efforts have been fragmented and scat- 
tered, many along disciplinary lines. 

Five years on from March 2011, problems 
abound. Fukushima and the Tohoku areas 
are yet to recover, and the transition towards 
renewables has been rocky. Most, if not all, 
of these issues are fundamentally political 
and socio-economic. But scientists, social
scientists and their funders must engage. 
Without better connections across disci- 
plines and nations, the science-policy inter- 
face cannot improve. The people of Japan 
deserve better.
CORRECTION

The Comment ‘Current climate models are 
grossly misleading’ (Nature 530, 407-409; 
2016) gave the wrong citation for the review 
about the use of IAMs in climate economics.

The correct reference is J. D. Farmer et al. 
Environ. Res. Econ. 62, 329-357 (2015). 
Baby boom: new arrivals at a London hospital in 1945.


PUBLIC HEALTH 


Tracing the social 
roots of health 


Andrew Steptoe applauds a cogent exploration of Britain’s 
groundbreaking longitudinal birth-cohort studies. 


Helen Pearson's fascinating The Life
Project examines the history and
legacy of the British birth-cohort
studies, the first of which launched 70 years
ago. These longitudinal investigations follow 
large samples of people ‘recruited’ as infants, 
periodically gathering data relevant to social 
and psychological development, schooling, 
employment and mental and physical health. 
These studies — of, collectively, some 
70,000 people — have played a crucial 
part in identifying how socio-economic 
circumstances drive inequalities in health and 
development. They have informed health, 
education and social policy, and provided 
a template for the ‘life course’ approach to 
health and development, studying how early 
experiences shape later outcomes. They are 
the envy of the world, and have contributed to 
topics as diverse as the perinatal determinants 
of adult health, the establishment of free nursery
places for British 3- to 4-year-olds and the
drive to promote adult literacy and numeracy. 

The Life Project: The Extraordinary Story of Our Ordinary Lives
HELEN PEARSON
Allen Lane: 2016.

Pearson, Nature’s chief features editor,
reveals that the studies emerged haphazardly.
The first, the National Study of
Health and Development (NSHD), following
5,362 people born in England, Scotland
or Wales in 1946, was stimulated by
1930s concerns about the falling birth rate,
but began, ironically, amid the
post-Second World War baby boom.
The National Child
Development Study (NCDS), following
17,000 people born in 1958, aimed to address 
perinatal mortality, and collected data on 
deaths as well as live births. Perinatal health 
was also a major issue for the British Cohort 
Study (BCS70), following 17,000 born in 
1970. The Millennium Cohort Study (MCS) 
follows 19,000 people born in 2000-02. Pear- 
son also covers the local Avon Longitudinal 
Study of Parents and Children (ALSPAC) 
in southwest England — which focuses on 
14,000 women recruited while pregnant in 
1991-92, and their children. In addition, she 
looks at the Life Study, which was planned to 
begin in 2012 but never got off the ground. 
Along with providing key information 
about maternal and neonatal health, the stud- 
ies evolved wider remits. James Douglas, first 
director of the 1946 study, set the standard 
by collecting data on social and economic 
conditions, father’s occupation, diet, health, 
temperament and behaviour, as well as birth 
weight and infant health and survival. Each 
cohort developed its own identity, a product 
of the principal investigators’ interests and the
era in which it was launched. As the NSHD’s 
cohort turns 70 this year, the study is embrac- 
ing detailed measures of brain function, 
cardiovascular activity and physical capabil- 
ity. ALSPAC, begun by epidemiologist Jean 
Golding in the era of molecular medicine, 
included collection of DNA and other biolog- 
ical samples from the beginning. BCS70, by 
contrast, is collecting biomarkers for the first 
time this year, when its participants are 46. 
The NCDS moved towards social issues 
such as education when formidable educa- 
tional psychologist Mia Kellmer Pringle took 
the lead. Its findings on the enduring impact 
of social and economic adversity on child 
development have fed into national debates 
about social mobility and cycles of depriva- 
tion. The MCS, initially led by demographer 
Heather Joshi, is now being used to investigate 
emerging twenty-first-century issues, such 
as the growth of childhood obesity, parental 
involvement in learning and the impact of 
birth season on educational attainment. 
Some of the studies have had a bumpy ride, 
as Pearson relates. The 1946 cohort is over- 
seen by the UK Medical Research Council 
(MRC), and this ensured regular assessment 
throughout childhood. But contact was not 
maintained with the 1958 sample, so partici- 
pants had to be traced from scratch for data 
sweeps at ages 7 and 11. The BCS70 launched 
with funding from 23 organizations, then 
was virtually appropriated by the ebullient
but eccentric paediatrician Neville Butler. He 
was able to continue the study by cobbling 
together funding from diverse sources such 
as wealthy aristocrats, philanthropists, actors 
and celebrities. He designed many of the data 
sweeps himself — and published results spo- 
radically. The study almost fell off the map 
until it was rescued by the Social Statistics 
Research Unit at London's City University in
1996. Perhaps even more extraordinary was 
the MCS, a mainly political initiative by the 
government of then-prime minister Tony 
Blair, devised to celebrate the millennium. 
Funding was not agreed until early in 2000, so 
planning was rushed and the design far from 
ideal: when the participants were recruited, 
the babies were already nine months old. 
Pearson concentrates on social history 
rather than research findings, although she 
does highlight impacts on social and edu- 
cational policy and medical practice. This is 
wise, because the science is difficult to synthe- 
size: thousands of papers have been written 
on the studies’ biomedical, social, economic 
and educational ramifications. Pearson 
inserts stories of participants such as Steve 
Christmas of the 1958 cohort, who overcame 
the disadvantages of a limited education with 
hard work and determination. These provide 
vivid accounts of growing up in different dec- 
ades, and the roles of educational opportunity 
and psychological outlook in shaping lives. 
As Pearson explains, the studies have had 
to fight for funding, owing in part to a ten- 
sion between the need to maintain a sequence 
of measurements to understand how people 
change as they age, and the need to generate 
new hypotheses and collect new data in every 
funding period. At one time or another, each 
investigation was at risk of disbanding. 
Which brings us to the melancholy tale of 
the Life Study. Planned by paediatric epidemiologist
Carol Dezateux to begin in 2012 with
80,000 participants, the project aimed to [...]
blood, urine, saliva and faeces; and
recordings of behaviour. It was immensely 
complex, balancing the demands of medi- 
cal and social scientists, and costs spiralled. 
Recruitment finally began early in 2015, but 
uptake was low, and the UK Research Coun- 
cils withdrew funding within months. 

It seems likely that there will be no new 
national birth cohorts any time soon. Yet it is 
encouraging that major funders such as the 
MRC and the Economic and Social Research 
Council have recognized the value of such 
studies. They are an important national 
research resource, and The Life Project does a 
great service in bringing them and the people 
at their heart to life for a general readership.


Andrew Steptoe is director of the Institute 
of Epidemiology and Health Care and 
British Heart Foundation professor of 
psychology at University College London. 
e-mail: a.steptoe@ucl.ac.uk 
Dark Territory: The Secret History of Cyber War

Fred Kaplan SIMON AND SCHUSTER (2016)
nd drones of on-the-ground warfare, a shadow 


conflict is playing out in a virtual theatre of war. Covert cyberattacks 


al systems that control everything from dams 
ium-enrichment facilities — are a new norm. 


Pulitzer-prizewinning journalist Fred Kaplan’s taut, urgent history 
traces the dual trajectory of digital surveillance and intervention, and
high-level US policy from the 1980s on. In 2014 alone, he reveals,
almost 80,000 US security breaches occurred — an artefact of the 
very network connectivity that enriches the country’s economy. 


Being a Beast 

Charles Foster PROFILE (2016) 

Humanity’s obsession with animals spans shamanic ‘possession’ by 
wolves and a hefty tranche of children’s literature. Charles Foster’s 
contribution might just stand alone. In his exquisitely strange, often 
hilarious chronicle, the writer recounts neuroscientific self-experiments 
centred on immersion in the sensory maelstrom experienced by 
iconic British species. He gets down and dirty, chomping earthworms 
and sleeping in a homemade sett (badger); honing his olfaction and 
attempting to hunt voles (urban fox); and parachuting (swift). A bold, 
unsettling try at comprehending the rest of life on Earth. 


Meathooked 

Marta Zaraska BASIC (2016) 

From char siu to boeuf bourguignon, meat has us hooked, proves 
journalist Marta Zaraska. Starting 1.5 billion years ago, when one 
bacterium first engulfed another, she zips through the evolution of 
human carnivory and examines the enduring pull of animal flesh
by way of genetics, developmental biology, chemistry and nutrition.
Zaraska negotiates the complexities nimbly, from meat’s pivotal part 
in building our big brains to the 1,000 substances that underpin its 
cooked odour (3-octen-2-one, for instance, smells of “crushed bugs”) 
and the unpalatable influence of the industry on research. 


Raptor: A Journey Through Birds 

James Macdonald Lockhart FOURTH ESTATE (2016) 

Fifteen species of raptor — from hen harrier to red kite — form 
nuclei for the crystalline narratives of this meditation on the British 
wild, a winner of the 2011 pre-publication Jerwood Award for Non- 
Fiction. James Macdonald Lockhart grew up knowing birds through 
photographs shot by his great-grandfather, Seton Gordon. His own 
understanding of raptor ethology shines. His journey — intercut with 
passages by Victorian ornithologist William MacGillivray — flings us 
into skies where a hobby ‘concertinas’ the air, or a marsh harrier’s 
ruff gives it the air of an Elizabethan grandee.


Eco-Homes: People, Place and Politics 

Jenny Pickerill ZED (2016) 

We have the science, technology and political will to build eco- 
housing — as well as to retrofit our not-so-green abodes. What stops 
us? The greatest hurdle is often cultural adaptation to a new norm, 
concludes environmental geographer Jenny Pickerill in her cogent 
sociopolitical work. Focusing on self-build eco-housing, Pickerill 
looks at more than 30 case studies from a range of countries, along 
with lessons learned on construction materials, geography and 
climate, gender, costs and the right policy for rapid rollout. Barbara Kiser 

Interactions between predators and prey, such as lynxes and hares, can be modelled with biological rules. 


Biology distilled 


Brian J. Enquist reflects on a blueprint to guide the recovery of life on Earth. 


Can biology become as predictive as the 
physical sciences? And could it guide 
us in feeding the planet and reversing 
ecosystem degradation, climate change and 
species extinction? These are the overarching 
questions that organize evolutionary biologist 
Sean Carroll's The Serengeti Rules — a 
compelling read filled with big, bold ideas. 

Biology is complicated. A cursory look at 
a diagram of the biochemical pathways in a 
cell or interactions between predators and 
prey in an ecosystem reveals myriad networks. 
Indeed, following the cascading effect 
of genes on a phenotype in or between individuals 
suggests that biological processes are 
almost unfathomably complex and idiosyncratic. 
Over the past two decades, those studying 
such systems have argued that most can be 
described as a network of interactions. However, 
this analogy, although powerful, does 
not necessarily tell us about the resilience of a 
given system. Nor does it reveal the generalities 
of the natural world and the regulation of 
genes, populations and ecosystems. 

Carroll tells a richer story. He dispels the 
idea that biology is too complex to gen- 

unification of biology that has occurred in 
the shadows of better-known work such as that 
by Charles Darwin, James Watson and Francis 
Crick. Through compelling storytelling, key 
insights of distant, isolated biologists are 
brought to life. 
Carroll finds the common thread in 
discoveries in anatomy, physiology, gene 
regulation and cancer research. He does so 
by way of Nobel-prize-winning molecular 
biologist Jacques Monod, Janet Rowley, ‘mother 
of chromosome genetics’, and ecologists such 
as Tony Sinclair, who has helped to parse the 
ecology of the Serengeti region in Tanzania 
and Kenya. 
Carroll distills this body of knowledge 
into principles. He argues that at all scales of 
organization, biology is regulated through 
axioms of interactions in networks — from 
the number of molecules in our bodies to 
the numbers and kind of animals and plants 
in and across ecosystems. He boils all of 
biology down to six rules of regulation (his 
“Serengeti Rules”), which he shows are applicable 
both to the restoration of ecosystems 
and to the management of the biosphere. 

The Serengeti Rules: The Quest to Discover How Life Works and Why It Matters 
SEAN B. CARROLL 
Princeton University Press: 2016. 

The same rule may carry different names 
in different biological contexts. The double- 
negative logic rule, for instance, enables 
a given gene product to feed back to slow 
down its own synthesis. In an ecosystem the 
same rule, known as top-down regulation, 
applies when the abundance of a predator 
(such as lynx) limits the rise in the popu- 
lation of prey (such as snowshoe hares). 
This is why, in Yellowstone National Park 
in Wyoming, the reintroduction of wolves 
has resulted in non-intuitive changes in 
hydrology and forest cover: wolves prey 
on elk, which disproportionately feed on 
streamside willows and tree seedlings. It is 
also why ecologists can continue to manage 
the Serengeti, and have been able to ‘rebuild’ 
a functioning ecosystem from scratch in 
Gorongosa National Park, Mozambique. 
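
The top-down regulation described above can be illustrated with the classic Lotka-Volterra predator-prey equations. The sketch below is not taken from Carroll's book or from this review: the choice of model, every parameter value and the function name `simulate` are assumptions made purely for illustration. It compares a prey population that is checked by a predator with one that is not.

```python
def simulate(prey0, pred0, steps=20000, dt=0.001,
             growth=1.0, predation=0.1, conversion=0.02, death=0.5):
    """Integrate the Lotka-Volterra equations with simple Euler steps.

    All parameter values are illustrative assumptions, not fitted data.
    Returns a list of (prey, predator) tuples, one per time step.
    """
    prey, pred = float(prey0), float(pred0)
    history = [(prey, pred)]
    for _ in range(steps):
        # Prey grow exponentially unless eaten (the top-down check).
        d_prey = growth * prey - predation * prey * pred
        # Predators grow by eating prey and otherwise die off.
        d_pred = conversion * prey * pred - death * pred
        prey += d_prey * dt
        pred += d_pred * dt
        history.append((prey, pred))
    return history


if __name__ == "__main__":
    regulated = simulate(prey0=10, pred0=5)    # hypothetical hares and lynx
    unregulated = simulate(prey0=10, pred0=0)  # same prey, no predator
    print("peak prey with predators:", round(max(p for p, _ in regulated)))
    print("final prey without predators:", round(unregulated[-1][0]))
```

With a predator present, prey numbers cycle within bounds; with the predator removed, the same prey population grows without limit in this toy model. Real ecosystems involve many more interactions than these two equations capture.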

Carroll argues that the rules regulat- 
ing human bodily functions — which have 
improved medical care and driven drug dis- 
covery — can be applied to ecosystems, to 
guide conservation and restoration, and to 
heal our ailing planet. His Serengeti Rules 
encapsulate the checks and balances that 
minimize boom-and-bust cycles of species 
outbreaks and ecosystem imbalances. Eco- 
logical systems that are missing key regula- 
tory players, such as predators, can collapse; 
if they are overtaken by organisms spread by 
human activities, such as the kudzu vine, a 
‘cancer-like’ growth of that species can result. 

Some of Carroll’s recommendations are 
still being debated (see R. D. Grubbs et al. 
Sci. Rep. 6, 20970; 2016). Other pertinent 
work, such as findings by evolutionary ecol- 
ogist Daniel Janzen on forest restoration, is 
already commonly used in conservation. 
Nonetheless, I suspect that many will find 
new insights and inspiration here. 

Carroll has made a strikingly clear case that 
ecology is a science on a par with molecular 
biology and genetics. In many ways, this book 
is a homage to Charles Elton, who helped to 
define ecology as the study of species inter- 
actions in a ‘trophic’ network shaped by the 
environment (see E. Marris Nature 459, 327–328; 
2009). Building on his vision, Carroll 
provides a passionate motto for the twenty- 
first century: “better living through ecology”. 

Are the Serengeti Rules a panacea? No, 
but Carroll convincingly reveals them to be 
a sturdy foundation for the future of biology, 
for human well-being, and for conservation 
and management. 


Brian J. Enquist is in the department of 
ecology and evolutionary biology at the 
University of Arizona in Tucson, and is an 
external professor at the Santa Fe Institute in 
New Mexico. 

e-mail: benquist@email.arizona.edu 


Japan justifies 
whaling stance 


As Japan’s commissioner to 
the International Whaling 
Commission (IWC), I disagree 
that the IWC’s review process of 
scientific whaling is “a waste of 
time” (A. Brierley and P. Clapham 
Nature 529, 283; 2016). 

The process comprises an 
independent expert-panel review 
and a wider review by the IWC 
Scientific Committee. Research 
proponents have no say in the 
expert panel's conclusions. Japan 
has given due regard to the IWC’s 
criticisms after peer review of its 
NEWREP-A proposal, which gave 
the scientific rationale for lethal 
sampling (see go.nature.com/wapxyb; 
go.nature.com/vxjaz6). 

Brierley and Clapham say that 
Japan failed to alter its research 
plans “in any meaningful way” 
following recommendations by 
the IWC Scientific Committee 
that it should explore widely used 
non-lethal alternatives. In fact, 
those methods were included in 
the research plans for evaluation 
in light of the research objectives. 
As the International Court of 
Justice recognized in 2014, certain 
data cannot be obtained by non-lethal 
methods (see go.nature.com/fboxrt). 
Japan's new research 
programme includes both lethal 
and non-lethal research methods. 

The authors’ allegation that 
Japan’s whaling is “ostensibly” for 
research is no basis for proper 
scientific debate. Japan has made 
clear that it is always willing to 
answer questions on its research 
programme (see go.nature.com/dut2kx), 
and looks forward to 
constructive scientific discussion 
at the committee's June meeting. 
Joji Morishita National Research 
Institute of Far Seas Fisheries, 
Shizuoka, Japan. 
jmorishita@affrc.go.jp 


NIH push to stop 
sexual harassment 


As the leading US government 
funder of scientific research, we at 
the National Institutes of Health 
(NIH) are deeply concerned 
about sexual harassment in 
science (Nature 529, 255; 2016). 
With the help of colleagues in 
government, academia and the 
private sector, the NIH aims to 
identify the steps necessary to 
end this in all NIH-supported 
research workplaces and scientific 
meetings. 

In September last year, we 
restated our expectation that 
organizers of NIH-supported 
conferences and meetings should 
assure a safe environment, 
free of discrimination (see 
go.nature.com/zmukk8). 

Over the next few weeks 
to months, we plan to work 
with governmental, academic 
and private-sector colleagues 
to identify potential steps towards 
translating our expectations into 
reality. An important first step 
will be to gather as much data as 
possible to more fully understand 
the nature and extent of sexual 
harassment among scientists. 
These data should guide us 
in determining what kinds of 
policy and procedure are most 
likely to help. We will also work 
to determine what levers are 
already available to influential 
stakeholders — us as funders, as 
well as university administrators 
and departments, journal editors, 
and organizers and hosts of 
scientific meetings. 

We owe this to our colleagues 
and the public, who trust in our 
ability to make the biomedical 
research enterprise the best that 
it can be. 

Michael Lauer, Hannah 
Valantine, Francis S. Collins 
National Institutes of Health, 
Bethesda, Maryland, USA. 
michael.lauer@nih.gov 


Australians rush to 
reject primate bill 


A bill introduced in the 
Australian Senate proposes an 
amendment that would prohibit 
imports of live non-human 
primates for research purposes 
(see Nature http://doi.org/bcqx; 
2016). We call for the Senate 
to reject this bill in support of 
ethically conducted research and 
preserving the animals’ long- 
term health, for which exchange 
between international breeding 
facilities is crucial. 

The bill was referred for 
enquiry to the Senate Legislation 
Committee for Environment and 
Communications, which sought 
public submissions in late 2015 
(see go.nature.com/mjahre). 
Three days before the first public 
hearing on 5 February, only 2 out 
of 56 submissions argued against 
the amendment. 

At that point, we contacted the 
Australian scientific community 
— heads of research institutes and 
those working with non-human 
primates — and discovered that 
they were largely unaware of the 
proposed legislation. 

A flurry of written submissions 
and last-minute personal 
representations to members of 
the Senate followed, all calling 
for rejection of the bill. The 
international community took 
up the issue, with many scientific 
societies and research institutes 
reacting within 48 hours (see 
go.nature.com/oihuxp). 

The Senate committee accepted 
many submissions after the 
closing date and is due to submit 
its report early this month. The 
episode demonstrates the alacrity 
with which scientists, typically 
a reticent group, are prepared to 
engage with the political process 
when the issue is perceived as 
important for the advancement 
of science. 

Nicholas Price, James Bourne, 
Marcello Rosa Monash 
University, Melbourne, Australia. 
nicholas.price@monash.edu 


Transparency: issues 
are not that simple 


We find Stephan Lewandowsky 
and Dorothy Bishop’s framing 
of science governance to be 
overly simplistic and in need of a 
firmer evidence base (‘Don't let 
transparency damage science’ 
Nature 529, 459-461; 2016). 

The authors’ analysis is biased 
by its reliance on testimonials 
from the narrow range of invited 
experts at last year’s Royal Society 
meeting (see go.nature.com/zptirs). 
Complex issues associated 
with openness and transparency 
also need to be taken into account 
(see S. Jasanoff Law Contemp. 
Probl. 69, 21-45; 2006). 

The authors present important 
topics such as expertise, 
disciplinary boundaries and 
communication as simple 
dichotomies. These divisions 
overlook extensive nuanced 
evidence from the social-science 
literature about who counts 
as an expert and under which 
conditions (see, for example, 
go.nature.com/xdfzrn). 

In our view, governance 
issues around openness and 
transparency should not be 
framed only by the research 
community. The debate must 
also include representatives from 
across the broad range of public 
viewpoints. 

Warren Pearce, Sarah Hartley, 
Brigitte Nerlich University of 
Nottingham, UK. 
warren.pearce@nottingham.ac.uk 


Transparency: an 
opaque illustration 


We question the choice of the 
padlock and dagger illustration 
you used to open the discussion 
by Stephan Lewandowsky and 
Dorothy Bishop (‘Don't let 
transparency damage science’ 
Nature 529, 459-461; 2016). To 
us, this falsely implies that the 
article is about open access to 
journal publications and, by its 
association with the title, that 
open-access publishing presents 
a threat to science. 

The authors send no such scary 
message. Their article calls attention 
to the more general concepts 
of openness and transparency 
in providing access to original 
research data. 
Karen Shashok Granada, Spain. 
Remedios Melero CSIC Institute 
of Agrochemistry and Food 
Technology (IATA), Valencia, Spain. 
kshashok@kshashok.com 


Reactions triggered electrically 


Single-molecule experiments have revealed that chemical reactions can be controlled using electric fields — and that the 
reaction rate is sensitive to both the direction and the strength of the applied field. SEE LETTER P.88 


LIMIN XIANG & N. J. TAO 


Chemical reactions turn the elements in 
the periodic table into substances that 
make up everything we use, ingest and 
breathe, and into the molecules in our bodies. 
Developing new and more-efficient ways to 
control reactions has therefore been a constant 
quest for chemists. On page 88 of this issue, 
Aragonés et al. describe a new way to control 
chemical reactions. They report that a classic 
organic transformation called the Diels-Alder 
reaction can be accelerated using an electric 
field, and that the reaction rate sensitively 
depends not only on the strength of the field, 
but also on its polarity. 

Chemical reactions typically involve the 
transfer of electrons between atoms and molecules, 
and rearrangements of the positions 
of nuclei. To control reactions, one must find 
a way to promote these processes, and many 
different driving forces have been used, including 
heat, light and pressure. It has been predicted 
theoretically that reactions may also 
be controlled using an electrostatic field. This 
is possible because some covalently bonded 
species can be regarded as combinations of 
‘resonance contributors’ — molecular structures 
in which the electrons of chemical bonds 
can be localized in various ways. A properly 
oriented electric field can accelerate or decelerate 
a chemical reaction by stabilizing or destabilizing 
these contributors. 

This mechanism for promoting reactions is 
distinctly different from that for electrochemi- 
cal reactions, in which an electric field tunes 
the energy levels of electrons in molecules 
such that electron transfer between the elec- 
trode and the molecules can occur. It is also 
different from cases in which an electric field 
affects a reaction by changing local concentra- 
tions of the reactants or catalysts (enzymes), 
or by bringing charged groups of biological 
molecules close to each other. 

Figure 1 | A chemical reaction controlled by an electric field. Aragonés et al. attached a diene 
molecule (blue) to a sharp gold tip, and a dienophile (red) to a gold substrate. They observed that an 
applied electric field (arrow) increased the rate of reaction between the molecules, and that the rate 
increased with the field’s strength. An electric field pointing in the opposite direction does not affect the 
reaction rate (not shown). 

Achieving an electric-field-induced reaction 
has been a challenge for two main reasons. 
First, the directions of the electric field and 
of the molecules must be aligned in a specific 
way; and second, a detection method must 
be used that is sensitive enough to measure 
the enhanced reaction rate. Aragonés 
et al. used a scanning tunnelling microscopy 
break-junction (STM-BJ) technique to 
create a strong electric field for controlling 
their Diels-Alder cycloaddition — a reaction 
that occurs between organic molecules called 
dienes and dienophiles (Fig. 1). They also used 
this approach to measure the reaction rate. 
The STM-BJ technique was developed to 
measure the electronic properties of single 
molecules in a vacuum, in air and in solutions 
at various temperatures, making it suitable for 
studying chemical reactions at the molecu- 
lar level. It involves precisely controlling the 
separation between a sharp gold tip and a gold 
substrate. In Aragonés and colleagues’ study, 
the gold tip was modified by attachment of 
a diene molecule, and the gold surface was 
modified with a dienophile (Fig. 1). When a 
voltage was applied between the tip and the 
substrate, the sharpness of the tip and the 
small separation between the tip and the sub- 
strate helped to generate a strong, directional 
electric field that aligned the reactants and 
facilitated a Diels-Alder reaction. Because 
the STM-BJ technique can detect and analyse 
a single molecule bridging the tip and 
substrate, from the current flowing through 
it, the authors were able to detect the reaction 
product at the single-molecule level. 

A crucial parameter in the description of a 
chemical reaction is the reaction rate, which 
measures how many molecules can be pro- 
duced in a given time interval. By combining 
the experimental results with computational 
approaches, Aragonés et al. showed that, by 
increasing the applied electric field 15-fold, the 
rate of their reaction increased by up to 5-fold. 

Even more interestingly, the researchers 
observed no effects on the reaction rate in 
experiments in which the electric field was 
reversed, demonstrating the importance of 
the direction of the field. This is because the 
chemical reaction requires the electrons to 
flow from the dienophile to the diene, to form 
a resonance structure in which the dienophile 
bears a positive charge and the diene car- 
ries a negative charge. Only an electric field 
pointing from the diene to the dienophile can 
accelerate this process, thereby increasing the 
reaction rate. 

In their experiments, Aragonés and col- 
leagues used the sharp tip and flat substrate 
as geometrically asymmetric electrodes. This 
asymmetry might contribute to the reaction 
rate’s dependence on the electric-field 
direction. Heat might also be generated when 
current passes through the molecules during 
the STM-BJ measurements, which could have 
an additional effect on the reaction rate. This 
should be investigated further. 

Although many details remain unexplored, 
the work provides the first experimental evi- 
dence that an electric field can control chemi- 
cal reactions. If this effect can be scaled up for 
commercially useful reactions on an industrial 
scale, it could have a huge economic impact. 
However, the STM-BJ set-up can create a 
large, directional electric field within only 
a tiny volume, and would not be suitable for 
industrial applications — another technique 
would need to be developed. Nevertheless, the 
STM-BJ approach certainly provides a new 
way to study and control chemical reactions at 
the single-molecule level, and might provide 
unprecedented information about reaction 
mechanisms in the future. 


Limin Xiang and N. J. Tao are at the Center 
for Biosensors and Bioelectronics, Biodesign 
Institute, Arizona State University, Tempe, 
Arizona 85287, USA. N.J.T. is also at the State 
Key Laboratory of Analytical Chemistry for 
Life Science, Nanjing University, China, and at 


EVOLUTION 


Mitochondria in the second act 

A large phylogenomics study reveals that the symbiotic event that led to the 
emergence of organelles known as mitochondria may have occurred later in 
the evolution of complex cells than was thought. SEE LETTER P.101 


THIJS J. G. ETTEMA 


The development of eukaryotic cells, 
which contain a nucleus and other 
membrane-bound compartments, is 
one of the most enigmatic events in the evolution 
of life on Earth. A crucial moment in this 
process was the emergence of mitochondria. 
These organelles are thought to have formed 
when a bacterial cell began living inside an 
archaeal host cell — a form of endosymbiosis, 
a mutually beneficial relationship in which 
one organism lives inside another. The bacterium 
is envisaged to have provided its host 
cell with additional energy, and the interaction 
eventually resulted in a eukaryotic cell that 
contained genes of both archaeal and bacterial 
provenance. In this issue, Pittis and Gabaldón 
(page 101) provide evidence that the host cell 
from which eukaryotes evolved was already 
genetically chimaeric before the mitochon- 
drial symbiosis, suggesting that mitochondria 
evolved later in eukaryotic evolution than was 
previously presumed. 

Mitochondria generate energy through 
oxidative-phosphorylation reactions and are 
therefore sometimes known as the ‘power- 
house’ of eukaryotic cells. Since their discovery 
more than 100 years ago, the origin of mito- 
chondria has been hotly debated. Currently, 
an overwhelming amount of evidence indi- 
cates that mitochondria are the result of a 
single endosymbiotic event, and that the mitochondrial 
progenitor was related to the Alphaproteobacteria 
(although to which group 
is still unclear). The archaeal host is thought 
to have been related to the Lokiarchaeota, a 
phylum of archaea that was recognized only last 
year. Yet there is still uncertainty over when 
mitochondria emerged during eukaryotic-cell 
evolution: was this an early, perhaps even 
initiating, event (‘mito-early’), or did it occur 
when the complexity of the eukaryotic cell was 
already largely established (‘mito-late’)? 
According to conventional mito-late 
models, eukaryotes emerged before the 
mitochondrial endosymbiont was acquired 
(Fig. 1). But the popularity of these models 
has been decreasing with the realization that 
eukaryotes that do not have mitochondria, 
and that are thought to have diverged evolutionarily 
before mitochondria evolved, 
contain organelles that are degenerate but 
clearly derived from mitochondria, such as 
hydrogenosomes and mitosomes. The finding 
that all known eukaryotes have (or once had) 
mitochondria resulted in a wave of mito-early 
hypotheses, in which the interaction between 
a primitive host cell and the mitochondrial 
endosymbiont was the main driving force for 
eukaryogenesis (Fig. 1). 

Figure 1 | Origin of eukaryotic cells and their mitochondria. Eukaryotic cells contain membrane- 
bound compartments, such as a nucleus and mitochondria. The latter are energy-generating organelles 
that are thought to have formed when a bacterial cell lived in a symbiotic relationship inside an archaeal 
cell. Models for the origin of eukaryotic cells are usually divided into ‘mito-early’ and ‘mito-late’ scenarios, 
depending on whether the mitochondrial endosymbiont was acquired early during eukaryotic evolution 
or when much of the complexity of eukaryotic cells was already established. Pittis and Gabaldón provide 
evidence for a ‘mito-intermediate’ scenario, in which the cell that hosted the mitochondrial endosymbiont 
displayed a degree of cellular complexity before the mitochondrial endosymbiosis. But, in contrast to 
conventional mito-late models, Pittis and Gabaldón’s results do not necessarily imply that the host cell was 
a fully fledged eukaryote. Hence, their findings are compatible with recent work that supports an archaeal 
origin for eukaryotes. 

A syntrophic interaction — in which one 
species lives off the products of another — 
is often invoked in these models, and it is 
thought that the most profound outcome of 
this interaction was the reallocation of energy 
production from the host cell’s membrane to 
the mitochondrial membrane. This compart- 
mentalized energy management provided the 
host with a surplus of energy, which is sug- 
gested to have triggered the emergence of the 
complex cellular features that are characteristic 
of eukaryotes. The result of this evolutionary 
journey is proposed to have been the first 
eukaryotic cell, with a chimaeric genome. 

However, although mito-early models have 
gained much support among evolutionary biol- 
ogists, the genomic chimaerism in eukaryotes 
hides a problem: most of the bacterial genes in 
eukaryotic genomes cannot be traced back to 
the alleged alphaproteobacterial ancestor of 
mitochondria. Instead, they seem to originate 
from various unrelated bacteria. Pittis and 
Gabaldón aimed to solve this mystery. 

By tracing phylogenetic signals of proteins 
that were present in the last eukaryotic common 
ancestor (LECA), Pittis and Gabaldón 
identified different classes of protein according 
to the timing of their appearance in eukaryotes. 
In agreement with other findings that imply an 
archaeal origin for eukaryotes, the authors 
found that the oldest LECA proteins are 
dominated by archaea-related proteins that 
are involved in essential cellular functions 
such as replication, translation and tran- 
scription. Furthermore, the most recently 
acquired LECA proteins are, unsurprisingly, 
dominated by bacterial proteins, most notably 
from alphaproteobacteria, that are primarily 
located in mitochondria and involved in energy 
generation. Most of these proteins prob- 
ably originate from the alphaproteobacte- 
rial ancestor of mitochondria. Intriguingly, 
however, Pittis and Gabaldón identified a third 
class of bacterial LECA protein that they infer 
was acquired before these mitochondrial 
proteins. Several of these proteins seem to be 
located in intracellular membrane systems, 
such as the endoplasmic reticulum and the 
Golgi apparatus. 

These findings shed light on the relative 
timing of the origin of mitochondria and 
the genomic nature of the host cell. First, the 
results imply that the host cell was already 
chimaeric before the mitochondrial endo- 
symbiosis. Second, the fact that several of the 
bacterial proteins that pre-date the mitochon- 
drial endosymbiosis operate in intracellular 
membrane systems suggests that the host cell 
already displayed a considerable degree of 
complexity, which is supportive of a relatively 
late mitochondrial origin (Fig. 1). 

However, Pittis and Gabaldón’s results raise 
a question: what, then, was the origin of the 
bacterial genes that pre-date mitochondrial 
endosymbiosis? Clearly, these genes can no 
longer be explained by ‘inherited chimaerism’ 
of the mitochondrial endosymbiont. The 
authors suggest that the genes may have been 
acquired through previous (endo)symbiotic 
interactions with different bacterial partners 
or by serial waves of horizontal gene transfer 
to the host genome. 

Although this question remains open, clues 
to an answer might come from genomes of the 
Lokiarchaeota phylum, members of which 
share a common ancestry with eukaryotes. 
Analysis of a lokiarchaeal genome indicated 
that nearly 30% of its genes display greater 
similarity to bacterial than to archaeal genes. 
Despite being extensively reshaped by evolu- 
tionary processes, some of the bacterial genes 
in the archaeal ancestor of eukaryotes could 
have ended up in the genomes of present- 
day eukaryotes. Future exploration of new 
lineages of archaea and eukaryotic microor- 
ganisms will provide yet more insight into 
the origin and early evolution of eukaryotes, 
including the key role of mitochondria in 
this process. 


new astronomy 


The discovery of gravitational waves from a merging black-hole system opens 
a window on the Universe that promises to test gravity at its strongest, and to 
reveal many surprises about black holes and other astrophysical systems. 


M. COLEMAN MILLER 


Shortly after Albert Einstein delivered his 
theory of gravity — general relativity — 
to the world in 1915, he discovered that 
binary stars and other sources should generate 
gravitational waves. Unfortunately, he also 
found that any imaginable source would produce 
gravitational waves so weak that detection 
was inconceivable using the technology 
of the day. But this inconceivable detection has 
now been reported by Abbott et al. (the LIGO 
Scientific Collaboration and the Virgo Collaboration) 
in Physical Review Letters. 

The authors describe the detection of the sig- 
nal GW150914 from gravitational waves gener- 
ated by the merger of two black holes (Fig. 1). 
These waves were detected from the temporary, 
tiny changes that they induced in the lengths of 
the two detectors of the US-based Advanced 
Laser Interferometer Gravitational-Wave 
Observatory (Advanced LIGO). The conse- 
quences of this detection are difficult to over- 
state, as is its promise for future advances and 
discoveries. Before this discovery, astronomers 
had only three types of messenger from space 
beyond our Solar System: photons, neutrinos 
and high-energy cosmic rays. Gravitational 
waves can now be added to this short list. 
Moreover, some of the most violent events in 
the Universe can be seen only in gravitational 
waves. Consider, for example, the inspiral and 
merger of two black holes, such as the one that 
caused GW150914. During the last moments 
of coalescence, the energy emitted in gravita- 
tional waves was tens of times larger than the 
energy emitted in those moments by all the 
stars in the Universe combined. But such an 
event is expected to be undetectable using any 
of the other messengers. Opening the gravi- 
tational-wave window will thus reveal events 
that had previously only been hypothesized. 
This window will also enable tests of gen- 
eral relativity in realms that were previously 
inaccessible. We can get an idea of the domain 
that can now be explored by considering a 
dimensionless quantity, GM/Rc’, that measures 
the importance of gravity for an object of mass 
M and radius R (G is Newton’s gravitational 
constant and c is the speed of light). For Earth, 
GM/Rc’ at the surface is approximately 
7 x 10°'°. For a star such as the Sun, GM/Rc’? 
new astronomy


The discovery of gravitational waves from a merging black-hole system opens 
a window on the Universe that promises to test gravity at its strongest, and to 
reveal many surprises about black holes and other astrophysical systems. 


M. COLEMAN MILLER 


Shortly after Albert Einstein delivered his
theory of gravity (general relativity)
to the world in 1915, he discovered that
binary stars and other sources should generate
gravitational waves. Unfortunately, he also
found that any imaginable source would produce
gravitational waves so weak that detection
was inconceivable using the technology
of the day. But this inconceivable detection has
now been reported by Abbott et al. (the LIGO
Scientific Collaboration and the Virgo
Collaboration) in Physical Review Letters.

The authors describe the detection of the
signal GW150914 from gravitational waves
generated by the merger of two black holes (Fig. 1).
These waves were detected from the temporary,
tiny changes that they induced in the lengths of
the two detectors of the US-based Advanced
Laser Interferometer Gravitational-Wave
Observatory (Advanced LIGO). The consequences
of this detection are difficult to overstate,
as is its promise for future advances and
discoveries. Before this discovery, astronomers
had only three types of messenger from space
beyond our Solar System: photons, neutrinos
and high-energy cosmic rays. Gravitational
waves can now be added to this short list.
Moreover, some of the most violent events in
the Universe can be seen only in gravitational
waves. Consider, for example, the inspiral and
merger of two black holes, such as the one that
caused GW150914. During the last moments
of coalescence, the energy emitted in gravitational
waves was tens of times larger than the
energy emitted in those moments by all the
stars in the Universe combined. But such an
event is expected to be undetectable using any
of the other messengers. Opening the
gravitational-wave window will thus reveal events
that had previously only been hypothesized.
This window will also enable tests of general
relativity in realms that were previously
inaccessible. We can get an idea of the domain
that can now be explored by considering a
dimensionless quantity, GM/Rc^2, that measures
the importance of gravity for an object of mass
M and radius R (G is Newton’s gravitational
constant and c is the speed of light). For Earth,
GM/Rc^2 at the surface is approximately
7 × 10^-10. For a star such as the Sun, GM/Rc^2
Figure 1 | A gravitational wave from merging black holes. Abbott et al. report the detection of a
gravitational wave, which they attribute to the coalescence of two black holes. a, The wave was first detected
at approximately 35 Hz, as it reached the sensitivity range of Advanced LIGO (the detecting observatory).
At this point, the black holes were spiralling in towards each other. The depicted radii are proportional
to the black holes’ masses. b, The wave frequency increased as the black holes coalesced; at the point
of merger, the black-hole horizons overlapped, but had not settled down to their final state. c, The wave
dissipated as the merged black hole attained its final, simple configuration. The wave depicted here is based
on observational data, but has been smoothed and fitted to a numerical model based on general relativity;
strain represents the fractional changes in distance that are produced by the waves. (Adapted from ref. 1.)


is still only about 2 × 10^-6. Previous tests of
general relativity have therefore been restricted 
to systems that have weak gravity. 

But at the event horizon of a black hole 
(the boundary beyond which nothing can 
escape the hole’s gravitational field), GM/Rc^2
is roughly 1, many orders of magnitude larger 
than for planets and stars. Gravity can thus be 
tested directly at its greatest strength for the 
first time, by analysing GW150914 and any 
other signals detected for similar mergers in 
the future. General relativity has passed the 
tests set by GW150914 with flying colours.
This signal has also provided the most direct 
confirmation yet of the existence of event 
horizons, which are unique to black holes. 
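
As a rough check on these numbers, the short Python sketch below evaluates GM/Rc^2 for Earth, the Sun and a black hole. The masses, radii and constants are standard rounded reference values rather than figures from this article, and the black hole is assumed, for simplicity, to be non-spinning, so that its horizon sits at the Schwarzschild radius 2GM/c^2. The function names are purely illustrative.

```python
# Compactness GM/(R c^2) for Earth, the Sun and a non-spinning black hole.
# All constants are rounded reference values (assumptions, not from the article).
G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8     # speed of light, m/s

M_EARTH, R_EARTH = 5.972e24, 6.371e6   # kg, m
M_SUN, R_SUN = 1.989e30, 6.957e8       # kg, m


def compactness(mass_kg: float, radius_m: float) -> float:
    """Return the dimensionless ratio GM/(R c^2)."""
    return G * mass_kg / (radius_m * C**2)


def schwarzschild_radius(mass_kg: float) -> float:
    """Horizon radius 2GM/c^2 of a non-spinning black hole (simplifying assumption)."""
    return 2.0 * G * mass_kg / C**2


earth = compactness(M_EARTH, R_EARTH)
sun = compactness(M_SUN, R_SUN)

# A 36-solar-mass hole, like the heavier of the two that merged.
m_hole = 36 * M_SUN
hole = compactness(m_hole, schwarzschild_radius(m_hole))

print(f"Earth:      {earth:.1e}")
print(f"Sun:        {sun:.1e}")
print(f"Black hole: {hole:.2f}")
```

Under these assumptions the ratio at the horizon is 1/2 whatever the mass, which is the sense of 'roughly 1' here, and it exceeds the solar value by more than five orders of magnitude.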

The discovery of GW150914 has profound 
consequences for astronomy. Previously known 
black holes that formed from a single star have 
quite a restricted mass range: the highest mass 
that was definitively established was found to 
be only about 15 times the mass of the Sun.
Analysis of GW150914 has doubled this mass 
record at a stroke (the merging black holes had 
masses 29 and 36 times that of the Sun), and
then doubled it again (the final merged black 
hole is inferred to have a mass 62 times that of 
the Sun). The spins of black holes are notoriously
difficult to measure, but Abbott et al. were
able to infer the spin of the final black hole from 
their data: the 160-kilometre-radius black hole 
spins completely around 100 times per second, 
which is roughly 70% of the maximum possible 
rate for a black hole of this mass. 
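
The quoted radius and rotation rate can be checked against the textbook formulas for a spinning (Kerr) black hole. The sketch below is an illustration, not the authors' analysis. It assumes that '70% of the maximum' corresponds to a dimensionless spin a of about 0.7, and it uses the standard expressions r+ = (GM/c^2)(1 + sqrt(1 - a^2)) for the horizon radius and a·c/(2·r+) for the angular velocity of the horizon. The constants are rounded reference values and the function names are hypothetical.

```python
import math

# Rounded reference constants (assumptions, not taken from the article).
G = 6.674e-11     # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8       # speed of light, m/s
M_SUN = 1.989e30  # solar mass, kg


def kerr_horizon_radius(mass_kg: float, spin: float) -> float:
    """Outer horizon radius r+ = (GM/c^2) * (1 + sqrt(1 - a^2)), in metres."""
    r_g = G * mass_kg / C**2
    return r_g * (1.0 + math.sqrt(1.0 - spin**2))


def horizon_rotations_per_second(mass_kg: float, spin: float) -> float:
    """Horizon rotation frequency: angular velocity a*c/(2*r+) divided by 2*pi."""
    omega = spin * C / (2.0 * kerr_horizon_radius(mass_kg, spin))
    return omega / (2.0 * math.pi)


final_mass = 62 * M_SUN   # mass of the merged black hole, from the text
assumed_spin = 0.7        # assumption: reading "70% of maximum" as a = 0.7

radius_m = kerr_horizon_radius(final_mass, assumed_spin)
rotations_per_s = horizon_rotations_per_second(final_mass, assumed_spin)

print(f"horizon radius: {radius_m / 1e3:.0f} km")
print(f"rotation rate:  {rotations_per_s:.0f} per second")
```

With these inputs the radius comes out close to 160 kilometres and the rotation rate close to 100 turns per second, in line with the figures quoted above.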


None of this could have happened without 
spectacular developments in instrumentation. 
Gravitational waves distort space and time only 
slightly at our distance from any likely sources. 
The distortion is characterized by a dimensionless
quantity called strain, which is the fractional
change in distances produced by the waves. 
Even for a fairly strong event such as the
black-hole merger, the change is tiny: the authors
find a maximum value of just 10^-21. This means
that the 4-kilometre-long, L-shaped arms of 
Advanced LIGO change in length by about 
1/200 of the radius of a proton. Such changes 
can nonetheless be seen because of the exquisite 
precision of the optics of Advanced LIGO, the 
delicacy of its suspension and the power of its 
lasers, which all result from years of development:
the LIGO detector has improved in all
respects by orders of magnitude since its first 
conception more than 40 years ago. 
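
The comparison with the proton can be reproduced with one multiplication. The sketch below assumes a peak strain of 10^-21 (the published peak value for GW150914), treats the change in arm length simply as strain times arm length, and takes the proton's radius to be about 0.84 femtometres, a reference value that does not appear in the article. It is an order-of-magnitude illustration only.

```python
# Order-of-magnitude estimate of how much a LIGO arm changes in length.
PEAK_STRAIN = 1e-21        # dimensionless fractional change (assumed peak value)
ARM_LENGTH_M = 4000.0      # 4-kilometre arm, from the text
PROTON_RADIUS_M = 0.84e-15 # approximate proton radius (assumed reference value)

# Strain is the fractional change in distance, so the absolute change is
# simply strain multiplied by the length being measured.
delta_l = PEAK_STRAIN * ARM_LENGTH_M

# How many times smaller than a proton's radius is that change?
ratio = PROTON_RADIUS_M / delta_l

print(f"change in arm length: {delta_l:.1e} m")
print(f"proton radius / change: about {ratio:.0f}")
```

The change is a few times 10^-18 metres, about two hundred times smaller than the assumed proton radius, which matches the fraction quoted above.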

Even more encouragingly, further major 
improvements are just around the corner. 
During its next run in 2016, Advanced LIGO 
will be able to observe about three times the 
volume of space that it could in 2015, and in 
the next year or two the Advanced Virgo detector
in Italy will join the search for gravitational
waves. A few years later, the Japanese Kamioka 
Gravitational Wave Detector will come online, 
and it is hoped that LIGO-India will join the 
hunt before 2025. This international network 
will also benefit from technological developments
in light manipulation such as those
at the GEO600 detector in Germany. The 
50 Years Ago


The dilatory attitude of the 
Committee on Libraries set up by the 
University Grants Committee and 
the Ministry of Education, and the 
naive remark in the Robbins Report 
that “a library adequate to scholarly 
research is as essential to the efficient 
running of a university as an adequate 
range of computers”, fully justifies
the anxiety expressed recently by
the Master of Sidney Sussex College,
Cambridge ... Universities, as Dr. D. 
Thomson pointed out, have managed 
for centuries without computers, but 
never without libraries. 

From Nature 5 March 1966 


100 Years Ago 


We notice in La Geographie
for November, 1915, that the
hydrographic department of the
French Admiralty have replaced
the German names in Kerguelen
by names of French origin. It must
be very galling to the French to see 
an abundance of German names 
scattered over the chart of their 
Antarctic island, especially as German 
explorers were never very sparing in 
their naming ... however, the practice 
of changing established names is a 
dangerous one if carried far, and it is to 
be hoped ... this principle will not be 
applied indiscriminately, for confusion 
would certainly be the result. 

ALSO: 

Those who are inclined to doubt 
whether museums play any useful 
part in war-time should read the 
account of what is being done in the 
Leicester Museum, by means of an
Infant Welfare Exhibition, to combat 
the appalling mortality among infants 
... This mortality, which is largely 
preventable, is brought out with 
startling vividness by means of a series
of wooden columns, that for infants 
up to twelve months old standing no 
fewer than 11 ft. high, while that for 
the death-rate between the ages from 
five to twenty is but 2% of an inch high.
From Nature 2 March 1916 


3 MARCH 2016 | VOL 531 | NATURE | 41 


| RESEARCH | NEWS & VIEWS 


resulting volume of space that can be explored 
will be tens of times greater than could be 
seen during the GW150914 detection, and 
will allow the direction of future events to be 
determined much more accurately than was 
possible for GW150914.

Surprises undoubtedly await, particularly
given that the ability to detect gravitational
waves at new frequency ranges is being developed
in facilities such as the space-based
Evolved Laser Interferometer Space Antenna
and the ground-based International Pulsar
Timing Array, and for various experiments
that are studying the polarization of the cosmic
microwave background (the oldest light in
the Universe). Just as when Galileo turned
his telescope to the heavens for the first time,
everything will be new. It is truly a privilege to
be present at the dawn of gravitational-wave
astronomy.


STEM CELLS


Dietary fat promotes
intestinal dysregulation


In mice, a high-fat diet has now been found to induce intestinal progenitor cells
to adopt a more stem-cell-like fate, altering the size of the gut and increasing
tumour incidence. SEE ARTICLE P.53


CHI LUO & PERE PUIGSERVER 


Excess food intake causes obesity and is
linked to many life-threatening diseases,
including cardiovascular disease and
cancer. Ingested dietary components cause a
well-understood physiological response that
readjusts metabolism and restores energy
balance, and that is controlled by nutrients,
hormones and systemic factors. But little is
known about how dietary components might
affect stem-cell biology to alter tissue function
or tumour development. On page 53 of this
issue, Beyaz et al. show that an increase in
dietary fat directly promotes the proliferation
of intestinal stem cells (ISCs) and the progenitor
cells to which they give rise, perhaps resulting
in more ‘seeds’ that can develop into cancer.
Tissue homeostasis in the gut requires
that resident ISCs maintain a dynamic balance
between self-renewal, which expands
the stem-cell pool, and differentiation into
daughters (intestinal progenitors that eventually
give rise to all the mature lineages of the
intestinal epithelium, the intestinal lining).
Nutrients cause changes in a variety of circulating
factors that influence adult stem-cell
biology, affecting this balance and so altering
tissue remodelling and regeneration. In addition,
ISCs are in close contact with dietary
constituents undergoing digestion, and so are
directly regulated by molecular components
of ingested food.
Beyaz et al. report that a high-fat diet (HFD)
elevates the number and proliferation rate
of both ISCs and progenitors in mice. This
alteration led to elongation and regeneration
of pits in the epithelium called crypts, in which
these cell types are located. The HFD augmented
the ability of intestinal crypts to give
rise to mini-gut-like structures called organoids
when cultured in vitro, an approach
that is widely used to assay ISC activity.
Progenitors from HFD mice could also form
organoids, suggesting that they become more
stem-cell-like under HFD conditions (Fig. 1).
Rather than being a result of obesity per se,
these changes in ISC and progenitor behaviour
were caused by certain fatty acids in the HFD.
Despite the changes in ISC and progenitor
function, Beyaz and colleagues found that the
overall length and weight of the intestine were
reduced in mice fed an HFD, compared with
control animals. And, explaining this finding,
the number of certain mature cell types
(absorptive cells and Paneth cells, which defend
against harmful bacteria in the gut) was
lower in HFD mice. Together, these findings
imply that an undifferentiated ISC pool is
maintained despite the expanded numbers and
increased regenerative potential of the ISCs.
Stem cells normally reside in a specialized
environment called a niche, in which communication
with neighbouring cells ensures their
precise regulation. Paneth cells are an essential
part of the niche, and are interspersed
throughout it. A previous study from the
same group revealed that calorie restriction
increases the number of niche Paneth cells,
promoting ISC self-renewal and subsequent
intestinal regeneration. By contrast, in the current
study, the authors demonstrated that an
unrestricted HFD uncoupled the ISCs from
their niche, allowing them to adjust to the
Figure 1 | Dietary fats remodel the intestine. Intestinal stem cells (ISCs) and their daughters, intestinal
progenitors, reside at the bottom of structures called crypts, and give rise to all the cell types of the mature
intestine, including Paneth cells and absorptive cells. When mice are fed a balanced diet, signals released
by Paneth cells (grey arrows) promote ISC regeneration. Beyaz et al. report that some of the fatty acids in
a high-fat diet (HFD) activate a signalling cascade that involves the nuclear receptor protein PPAR-δ and
the protein β-catenin. The cascade increases ISC and progenitor proliferation, makes progenitors more
stem-cell-like and enables ISCs to grow in the absence of signals from Paneth cells. This HFD-mediated
expansion of the stem and progenitor pool promotes tumour formation.
decrease in Paneth-cell numbers. For instance,
several signalling proteins (such as Jag1 and
Jag2) that are normally produced by Paneth
cells are upregulated in HFD-derived ISCs,
sustaining niche-independent growth.

The incidence of human colorectal cancer
correlates with diet-induced obesity. Furthermore,
adult stem cells are speculated to be the
origin of some cancers. Beyaz et al. showed
that the increased pool of ISCs and ISC-like
progenitors induced by an HFD predisposed
mice to intestinal tumours. By contrast, calorie
restriction, which also increases ISC numbers,
is associated with reduced tumour
initiation. The mechanistic differences underlying
altered stem-cell function in each condition
may partially explain this discrepancy.
Calorie restriction is associated with increased
interactions between the niche and ISCs,
whereas HFD-associated, niche-independent
growth allows stem cells to escape physiological
regulation. How the molecular pathways
modulated by these two dietary regimens
intersect and communicate in ISCs remains
to be investigated, and might identify putative
therapeutic targets.

Gene-expression profiles often provide clues
to the state of a cell. Nuclear receptors such as
PPAR and LXR proteins sense nutrients and
regulate gene expression, providing a connection
between diet-induced metabolic changes
and these profiles. Beyaz and colleagues analysed
gene expression and found that genetic
targets of one PPAR, PPAR-δ, were upregulated
in ISCs in HFD mice compared to ISCs in
control mice. PPAR-δ is linked to a signalling
cascade called the Wnt–β-catenin pathway,
which is involved in the development of intestinal
tumours. In the current study, genetic
and pharmacological experiments revealed
that activation of PPAR-δ and Wnt–β-catenin
signalling mediates, at least in part, the effects
of dietary fats on ISC and progenitor function
and intestinal tumour formation.

This work provides a plausible cellular and
molecular explanation for how an excess of
dietary fat remodels the intestine. However,
questions remain about how the basic mechanisms
of action of dietary fat affect systemic
energy metabolism and other gastrointestinal
diseases apart from cancer. For instance, it is
known that diet can influence immune and
metabolic activities by directly modulating
the diversity and functions of the gut microbiota
(the population of microorganisms that
inhabit the gut), but it remains unclear whether
and how such changes in the microbiota integrate
with the PPAR-δ pathway. Because the
microbiota differs between individuals, the
interplay between microbiota and ISCs under
HFD conditions might modulate an individual’s
tumour risk.

It would be interesting to investigate the
contribution of ISCs to gut inflammatory
disorders such as Crohn’s disease because, as
with cancer, an HFD accelerates the progress
of these disorders independently of obesity.
Furthermore, it is not known whether an HFD
affects the gut’s neuroendocrine system (the
hormone-releasing cells that receive input
from neurons), perhaps through its effects
on ISCs and progenitors, to contribute to the
metabolic alterations associated with obesity,
type 2 diabetes or cardiovascular diseases.
The current study does not address whether
the effects of an HFD on gut architecture are
reversible. Moreover, it is unclear how changes
in dietary regimens affect ISC function.
Finally, further investigation will be needed to
determine whether dietary or pharmacological
interventions that target ISCs could maintain
healthy intestinal function and reduce the
incidence of tumours or other HFD-associated
human diseases. Such research, building on
the foundation provided by the current study,
will be important for defining future steps in
personalized human nutrition and health.
beyond the knees


The development of a radio technique for detecting cosmic rays casts fresh light 
on the origins of some of these accelerated particles, and suggests that they might 
have travelled much farther than was previously thought. SEE LETTER P.70 


ANDREW M. TAYLOR 


A technique for measuring the composition
of cosmic rays in the energy
range 10^17 to 10^17.5 electronvolts (a
range thought to mark the transition point
from Galactic to extragalactic cosmic rays)
is reported by Buitink et al. on page 70
of this issue. The findings have implications
for our understanding of the sources of these
mysterious rays.

Energetic protons and atomic nuclei arriving 
at Earth are classified as cosmic rays. The 
spread in energy of these particles is impressive,
covering ten orders of magnitude (Fig. 1).
The most energetic cosmic rays have energies 
more than 10 million times those achievable 
for protons at the Large Hadron Collider, the 
world’s most powerful particle accelerator at 
CERN in Geneva, Switzerland. This begs the 
question of what cosmic sources can accelerate 
particles to such high energies. 

Clues about the origin of cosmic rays come
from both their composition and their energy
spectra. Unfortunately, these particles arrive
at Earth rather infrequently: only one particle
with an energy greater than 10^17 eV arrives
each day in every 10,000 square metres. This
scarcity makes it difficult to build detectors
able to capture sufficient arrival events to draw
statistically meaningful conclusions. At present,
detector size and observation times are
limited by the use of optical detectors, which
have areas of up to a few square kilometres and
can operate only on clear, moonless nights.

The region between about 10^15.5 eV and
10^18.5 eV (the second energy value is called
the ankle; Fig. 1) in the energy distribution
of cosmic rays marks an area of change. At
energies higher than 10^15.5 eV, the composition
of cosmic rays changes from lightweight particles,
such as protons and helium nuclei, to
increasingly heavy ones, such as the nuclei of
carbon and heavier elements. These changes
are thought to be associated with the maximum
energies up to which Galactic sources
(possibly supernova remnants, SNRs) can
accelerate cosmic rays. The maximum energy
for Galactic protons (the proton knee) is about
10^15.5 eV, whereas that for iron nuclei (the iron
knee) is about 10^17 eV. The ankle is thought to
be dominated by particles from extragalactic
sources, for example active galactic nuclei,
γ-ray bursts or neutron stars. But, within this
picture, the origin of cosmic rays between the
iron knee and the ankle is unclear.

Buitink et al. investigated the cosmic-ray 
composition in this energy region using a new 
Figure 1 | The energy spectrum of cosmic rays. When the number of cosmic rays per square metre that
arrive at Earth’s surface per day is plotted on a logarithmic scale against the energies of the cosmic rays,
also on a logarithmic scale, the overall graph (yellow line) between 10" and 10” electronvolts contains
four distinct linear regions, connected by changes in slope called the proton knee, the iron knee and the
ankle. The slope changes correspond to transitions between the composition or sources of cosmic rays
that dominate at different energies. Buitink et al. report on observations of cosmic rays between the iron
knee and the ankle, providing clues about their origins.


radio-based instrument. This can detect more
cosmic rays in a given period than can optical
detectors of equivalent effective area, because
it can operate both day and night. The authors’
detection method relies on descriptions of
the radio signals generated by air showers
(the cascades of energetic charged particles and
electromagnetic radiation produced when
cosmic rays enter the atmosphere). The radio
signal is produced by the interaction of the
charged particles with Earth’s magnetic field,
and by the development of a charge imbalance
within the shower through a phenomenon
called the Askaryan effect; the contributions
of these two effects to the signal are of comparable
magnitude.

This knowledge allowed Buitink and colleagues to probe the air-shower profile produced
by cosmic rays in the energy range of interest. Air showers involve a chain of processes
that rapidly increase the number of energetic particles (known as secondary particles)
within them. The rate of transfer of a cosmic ray’s energy to secondary particles depends
on its nuclear composition. The authors could discriminate between different species of
cosmic ray (that is, the type of particle that was originally accelerated) from the depth
in the atmosphere at which the maximum number of air-shower particles occurred.
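
Why the depth of shower maximum separates light from heavy primaries can be illustrated with a toy model. The sketch below is not the authors’ reconstruction method. It assumes the textbook superposition picture, in which a nucleus of mass number A behaves like A independent protons of energy E/A, and the three numerical parameters are illustrative placeholders rather than values from the study.

```python
import math

# Illustrative parameters (assumptions, not values from the article):
X_REF_G_CM2 = 700.0     # proton depth of maximum at the reference energy, g/cm^2
E_REF_EV = 1e17         # reference energy, eV
ELONGATION_RATE = 60.0  # change in depth of maximum per decade of energy, g/cm^2


def xmax_proton(energy_ev: float) -> float:
    """Depth of shower maximum for a proton, growing logarithmically with energy."""
    return X_REF_G_CM2 + ELONGATION_RATE * math.log10(energy_ev / E_REF_EV)


def xmax_nucleus(energy_ev: float, mass_number: int) -> float:
    """Superposition model: a nucleus of A nucleons acts like A protons of energy E/A."""
    return xmax_proton(energy_ev / mass_number)


if __name__ == "__main__":
    for name, a in [("proton", 1), ("helium", 4), ("iron", 56)]:
        print(f"{name:7s} X_max = {xmax_nucleus(1e17, a):.0f} g/cm^2")
```

In this picture heavier nuclei reach their maximum higher in the atmosphere, so a measured depth of maximum carries information about the mass of the primary.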

From a sample of 118 such profile measurements, the researchers concluded that the
fraction of cosmic rays consisting of protons and helium nuclei in the energy band 10^17 to
10^17.5 eV is between 38% and 98%, with a 99% confidence level; the best-fit value for this
light-mass fraction is about 80%. These results are consistent with previous results attained
using optical techniques, but this is the first time that such a composition measurement has
been made using a radio instrument.

The success of Buitink and co-workers’ method provides fresh clues about the origin of
cosmic rays between the iron knee and the ankle. It could be that cosmic rays in this
energy range are associated with an extragalactic component, which would challenge the idea
that the ankle represents the onset of this component in the energy distribution.
Alternatively, these cosmic rays might have a Galactic origin. This would indicate the
existence of a second population of Galactic sources capable of accelerating particles to
considerably higher energies than those achievable by SNRs. Either way, the authors’ findings
support the idea of light-mass cosmic rays in the knee-to-ankle region that must now be
explained.

The powerful method used to make this measurement might also seed a new era of
cosmic-ray science, in which previously unknown spectral features and changes in
composition at energies within the transition zone are probed with unprecedented detail.
Perhaps in this way we will uncover further surprises and evidence of extragalactic sources.


REGENERATION


Stem cells make the
bowel nervous


In Hirschsprung disease, the enteric nervous system (ENS) is missing from the 
distal bowel. It emerges that postnatal transplantation of stem-cell-derived ENS 
precursors can prevent death in a mouse model of the disease. SEE LETTER P.105 


ROBERT O. HEUCKEROTH


The enteric nervous system (ENS) is a network of neurons and supporting glial cells in the
bowel wall that is essential for digestion. When ENS precursor cells fail to migrate through
the full length of the bowel during the first trimester of pregnancy, a life-threatening
birth defect called Hirschsprung disease ensues, a prominent symptom of which is constant
contraction of the affected bowel regions. The standard treatment for children with
Hirschsprung disease is removal of the abnormal bowel, but many children continue to have
bowel problems after surgery. On page 105 of this issue, Fattahi et al. describe a method
for generating ENS precursors from stem cells. Remarkably, transplantation of these cells
into an animal model of Hirschsprung disease prevented premature death.

The ENS contains about as many neurons as the spinal cord, and its diversity of neuronal
subtypes rivals that of the brain. This complexity allows the ENS to recognize sensory input
from both the bowel wall and within the bowel, and to produce integrated bowel motility
patterns that facilitate nutrient absorption. The ENS also influences bowel inflammatory
cells, blood vessels, smooth muscle, intestinal pacemakers and the
epithelial cells that line the bowel. 

It is therefore not surprising that children with Hirschsprung disease develop abdominal
distension, vomiting and constipation, fail to grow normally and can die from sepsis
(a bacterial infection of the bloodstream). At least one-third of children with Hirschsprung
disease continue to have serious problems after surgery, including a life-threatening
syndrome called enterocolitis. Furthermore, some children with this disease have so little
bowel that is innervated by the ENS that they require intravenous nutrient delivery to
survive.

Exciting work suggests that regenerative 
medicine could one day offer an alternative 
to surgery for treating Hirschsprung disease. 
In this approach, stem cells would be trans- 
planted into and would restore function in 
bowel regions in which the ENS is missing. 
Ideally, the transplanted cells would come 
from the affected child (known as autologous 
transplantation) to avoid immune rejection, 
and non-surgical methods would be used for 
cell delivery. 

Of particular interest for this type of therapy 
are gut-derived ENS stem cells, which can be 
isolated from the human bowel at all ages and 
cultured in vitro. Following culture, these 
stem cells can be reimplanted in the bowel 
wall. They then migrate to the normal site of 
the ENS and differentiate into neurons and 
glial cells that mimic those of the native ENS. 
However, this therapeutic approach faces sev- 
eral challenges, including difficulty producing 
enough gut-derived cells, limited cell migra- 
tion, limited data about long-term safety and 
minimal information about the ability of these 
cells to restore gut function. 

Fattahi et al. address some of these problems 
using human embryonic stem (ES) cells, which 
are derived from early embryos and can give 
rise to every cell type in the body. To direct 
differentiation of human ES cells towards an 
ENS precursor lineage, the authors modulated 
signalling pathways that control development 
by inhibiting SMAD and glycogen synthase 
kinase proteins, and then treated the cells with 
the metabolite retinoic acid. Under these con- 
ditions, human ES cells differentiated into cells 
that resemble ENS precursors from the vagal 
region of the developing spinal cord (called the 
vagal neural crest). The authors refer to these 
cells as enteric neural-crest (ENC) precursors. 

These ENC precursors shared several key 
features with ENS precursors. For instance, 
when transplanted into the vagal neural-crest 
regions of developing chick embryos, ENC pre- 
cursors often migrated to the bowel, like nor- 
mal ENS precursors. When transplanted to the 
colon of young mice, ENC precursors popu- 
lated the bowel close to the location of the nor- 
mal ENS, but migrated even more quickly than 
fetal ENS-derived cells. When grown alongside 
human ES-cell-derived smooth-muscle cells, 
ENC precursors enhanced muscle differentiation and became neurons that could induce
muscle contraction when activated. Following an extended period of in vitro culture with
vitamin C and the growth factor GDNF, ENC precursors produced diverse neuronal and glial
cells similar to those of the ENS. Most impressively, when ENC precursors were transplanted
into the colons of mice with Hirschsprung-like disease, survival rates improved dramatically
over a short time interval (Fig. 1).

Figure 1 | Help for a model of Hirschsprung disease. Hirschsprung disease is a birth defect in
which the enteric nervous system (ENS) fails to colonize the full length of the bowel during
early pregnancy. To investigate possible therapies for the disease, Fattahi et al. grew human
embryonic stem (ES) cells (which can give rise to every cell type of the body) in vitro under
conditions that encouraged them to differentiate into cells resembling ENS precursors. The
authors transplanted these cultured cells into the colons of mice with a genetic mutation
that causes a Hirschsprung-like disease. The transplant prevented premature death in these
mice, although how the cells achieved this feat is not clear.

Finally, Fattahi et al. used ENC precursors 
harbouring a genetic mutation that predis- 
poses humans to Hirschsprung disease to per- 
form an in vitro drug screen, and discovered 
that inhibiting the protease enzyme BACE2 
enhanced ENC-precursor migration. The gene 
that encodes BACE2 is located in a region of 
chromosome 21 whose duplication increases 
the risk of Hirschsprung disease. This find- 
ing may be relevant to Down syndrome, in 
which children are born with three copies of 
chromosome 21 and rates of Hirschsprung 
disease are increased by as much as 100-fold. 

This study establishes a potentially limitless 
source of cells similar to those of the vagal neu- 
ral crest that could be tested for use in trans- 
plants to treat children with Hirschsprung 
disease or other disorders in which the ENS is 
defective. In an ideal therapy, ENC precursors 
would be produced from ‘induced pluripotent’ 
stem cells, which closely resemble human 
ES cells, but can be derived from the skin or 
blood cells of affected children, removing the 
need for embryo-derived cells and post-trans- 
plant immunosuppression. Fattahi and col- 
leagues provide preliminary data to suggest that 
this strategy will work well. Furthermore, the 
human ES-cell-derived ENC precursors they 
produced migrate efficiently through the bowel 
and could potentially be delivered through an 
endoscope, avoiding invasive surgery. 

Although these advances are exciting, many 
questions remain. In particular, it is unlikely 
that transplanted ENC precursors recreated a 
normal ENS in the Hirschsprung model mice, 
given the rapidity with which transplanta- 
tion rescued lethal bowel disease. Instead, 
minimally organized ENC precursors might 
have modulated immune activity or enhanced 
epithelial-cell function and repair by releasing 
neurotransmitter molecules (or other factors). 
Identifying these ENC-precursor-derived fac- 
tors might lead to the development of other 
treatment or prevention strategies that obvi- 
ate the need for cell-based therapies. Similarly, 
BACE2 targets that influence ENC-precursor 
migration could be used to enhance stem-cell 
therapy or to prevent Hirschsprung disease. 

Finally, the effect of transplanted ENC pre- 
cursors on bowel motility and long-term safety 
needs to be addressed. Nonetheless, Fattahi 
and colleagues’ study moves us one step closer 
to a time when autologous stem-cell therapy 
could replace surgery as a primary treatment 
for children with Hirschsprung disease. 
Genomic analyses identify molecular 
subtypes of pancreatic cancer 

Integrated genomic analysis of 456 pancreatic ductal adenocarcinomas identified 32 recurrently mutated genes that 
aggregate into 10 pathways: KRAS, TGF-β, WNT, NOTCH, ROBO/SLIT signalling, G1/S transition, SWI-SNF, chromatin 
modification, DNA repair and RNA processing. Expression analysis defined 4 subtypes: (1) squamous; (2) pancreatic 
progenitor; (3) immunogenic; and (4) aberrantly differentiated endocrine exocrine (ADEX) that correlate with 
histopathological characteristics. Squamous tumours are enriched for TP53 and KDM6A mutations, upregulation of the 
TP63ΔN transcriptional network, hypermethylation of pancreatic endodermal cell-fate determining genes and have a poor 
prognosis. Pancreatic progenitor tumours preferentially express genes involved in early pancreatic development (FOXA2/3, 
PDX1 and MNX1). ADEX tumours displayed upregulation of genes that regulate networks involved in KRAS activation, 
exocrine (NR5A2 and RBPJL), and endocrine differentiation (NEUROD1 and NKX2-2). Immunogenic tumours contained 
upregulated immune networks including pathways involved in acquired immune suppression. These data infer differences 
in the molecular evolution of pancreatic cancer subtypes and identify opportunities for therapeutic development. 

Pancreatic cancer (PC) is the fourth leading cause of cancer death in 
Western societies, and projected to be the second within a decade. It has 
a median survival measured in months and a five-year survival of <5%. 
Advances in therapy have only achieved incremental improvements in 
overall outcome, but can provide notable benefit for undefined subgroups 
of patients. As a consequence, there is an urgent need to better understand 
the molecular pathology of PC in order to improve patient selection 
for current treatment options, and to develop novel therapeutic strategies. 

Genomic analyses of pancreatic cancer reveal a complex mutational 
landscape with four common oncogenic events in well-known 
cancer genes (KRAS, TP53, SMAD4 and CDKN2A), amongst a milieu 
of genes mutated at low prevalence. Despite this heterogeneity, oncogenic 
point mutations of individual genes aggregate into core molecular 
pathways including DNA damage repair, cell cycle regulation, TGF-β 
signalling, chromatin regulation and axonal guidance. Increasingly 
sophisticated analyses are revealing biologically important events with 
clinical significance, including whole-genome sequencing, which 
sub-classifies PC into 4 subtypes based on the frequency and distribution 
of structural variation. Those termed unstable due to a large 
number of structural variants correlate with defects in DNA maintenance 
and therapeutic responsiveness to platinum-based therapies. 
Aberrations in other features that characterize cancer genomes, including 
mutational signatures and differential methylation, are providing 
deeper insights into disease pathophysiology. 

Here we performed a comprehensive integrated genomic analysis 
of 456 PCs and their histopathological variants using a combination 
of whole-genome and deep-exome sequencing, with gene copy num- 
ber analysis to determine the mutational mechanisms and candidate 
genomic events important in pancreatic carcinogenesis. RNA expres- 
sion profiles were used to define four subtypes and the different tran- 
scriptional networks that underpin them. These subtypes are associated 
with distinct histopathological characteristics and differential survival. 
Genomic and epigenetic features that characterize each subtype infer 
different mechanisms of molecular evolution. 


Mutational landscape of PC 

Study participants were recruited and consent for genomic sequencing 
obtained through the Australian Pancreatic Cancer Genome Initiative 
(APGI; http://www.pancreaticcancer.net.au) as part of the International 
Cancer Genome Consortium (ICGC; http://www.icgc.org). The 382 
APGI group consisted of participants with primarily treatment-naive 
resected PC, which were pancreatic ductal adenocarcinoma (PDAC) 
and its variants (adenosquamous, colloid, PDAC associated with intra- 
ductal papillary mucinous neoplasm (IPMN)) and a small number of 
rare acinar cell carcinomas (Supplementary Table 1). We detected 
23,538 high-confidence coding mutations, of which 7,377 were 
verified using orthogonal approaches (Supplementary Tables 1, 2 and 
19). A total of 21,208 high confidence genomic rearrangements were 
also identified (Supplementary Tables 3 and 4). To maximize the 
power to define coding driver mutations, 74 previously published 
PC exomes were included to yield a final cohort of 456 tumours. 
OncodriveFM detected 32 significantly mutated genes (false discov- 
ery rate (FDR) < 0.1), 22 of which were also identified by MutsigCV2 
(Q<0.1) and/or were supported by HOTNET2 analysis (Methods and 
Supplementary Table 5). These significantly mutated genes aggregated 
into 10 molecular mechanisms (Extended Data Fig. 1): with activating 
mutations of KRAS in 92%; disruption of G1/S checkpoint machinery 
(TP53, CDKN2A and TP53BP2) in 78%; TGF-β signalling (SMAD4, 
SMAD3, TGFBR1, TGFBR2, ACVR1B and ACVR2A) in 47%; histone 
modification (KDM6A, SETD2 and ASCOM complex members MLL2 
and MLL3) in 24%; the SWI/SNF complex (ARID1A, PBRM1 and 
SMARCA4) in 14%; the BRCA pathway (BRCA1, BRCA2, ATM and 
PALB2: 5% germline, 12% somatic); WNT signalling defects through 
RNF43 mutation (5%); and RNA processing genes, SF3B1, U2AF1 and 
RBM10 (16%). RBM10 is implicated in lung cancer, where inactivating 
mutations influence expression of oncogenic isoforms of NUMB. 
SF3B1 mutations in PC were aggregated at the K700E mutation hotspot 
common in myelodysplastic syndrome, breast and lung cancer 
and present a potential therapeutic target. Mutations in other genes 
encoding splicing machinery: SF3A1, U2AF2, SF1 and RBM6 were also 
identified (Extended Data Fig. 2 and Supplementary Table 6). 
GISTIC2 identified 50 regions of recurrent gain (43 focal, 7 chromo- 
somal arms) and 73 regions of loss (61 focal, 12 chromosomal arms) 
(Supplementary Tables 7-9). These regions included known oncogenes 
MET, NOTCH1 and GATA6 and tumour suppressor genes CDKN2A, 
SMAD4, TP53, BRCA1, ARID1A, PBRM1 and SMARCA4. Integrating 
copy number and expression data identified a number of genes/ampli- 
cons implicated in the progression of other cancer types that exhibited 
concordant gene expression changes (Supplementary Table 10). These 
included: amplification of MIB1, a known mediator of NOTCH signalling 
and pancreas development, and the CCNE1-URI1 amplicon at 
19q12 (Extended Data Fig. 2b). CCNE1 is a marker of poor prognosis 
in ovarian, breast and lung cancers and is associated with resistance 
to platinum-based therapy. Recent small interfering RNA (siRNA) 
screening of PC cell lines provides supportive evidence for CCNE1 
amplification as an important mechanism in pancreatic carcinogenesis, 
and may represent a therapeutic opportunity using CDK inhibitors. 
DNA deamination, ectopic APOBEC activity, BRCA-deficiency and 
mismatch repair were re-affirmed as the predominant mutational mechanisms 
in PC. Chromothriptic and breakage-fusion-bridge related genomic 
catastrophes were uncommon (12%; Supplementary Table 11). Somatic 
LINE-1 retro-transposition of known HotL1 elements was present 
in 35% of patients (Supplementary Table 12). As only one of these 
events directly affected a known cancer gene (insertion into ROBO2), 
it appears unlikely that this is a major mutational mechanism in PC. 
No recurrent fusion events were detected (Supplementary Table 13). 


Transcriptional networks and subtypes of PC 

We used bulk tumour tissue to better understand the transcriptional 
networks and molecular mechanisms that underpin the tumour 
microenvironment. Initial unsupervised clustering of RNA-seq 
data for 96 tumours with high epithelial content (>40%) to balance 
stromal gene expression resolved four stable classes (Fig. 1a and 
Extended Data Fig. 3). These four subtypes were also present in the 
extended set of 232 PCs using array-based mRNA expression profiles 
encompassing the full range of tumour cellularity (from 12-100%) 
(Extended Data Fig. 4). We named these subtypes: (1) squamous; 
(2) pancreatic progenitor; (3) immunogenic; and (4) aberrantly dif- 
ferentiated endocrine exocrine (ADEX) on the basis of the differential 
expression of transcription factors and downstream targets important 
in lineage specification and differentiation during pancreas develop- 
ment and regeneration. Transcriptional network analysis identified 
26 coordinately expressed gene programmes representing distinct 
biological processes, 10 of which discriminated the 4 PC classes 
(Fig. 1b, Extended Data Fig. 5 and Supplementary Tables 14-16). These 
4 subtypes were associated with specific histological characteristics: 
(1) squamous with adenosquamous carcinomas (6/25 in squamous 
versus 1/71 in the rest, P=0.0011 Fisher's exact test); (2) pancreatic 
progenitor and (3) immunogenic with mucinous non-cystic (colloid) 
adenocarcinomas and carcinomas arising from IPMN, which are 
mucinous (P= 0.0005); and (4) ADEX with rare acinar cell carcinomas 
(although numbers were small, both cases clustered with the ADEX 
class) (Fig. 1a). Squamous subtype was an independent poor prognostic 
factor (Fig. 1c and Supplementary Table 21). 
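
The enrichment of adenosquamous carcinomas in the squamous class (6/25 versus 1/71) can be re-derived from the counts quoted above. The following is a minimal sketch of a two-sided Fisher's exact test written with the Python standard library only. The function name `fisher_exact_two_sided` is an illustrative choice, and the sketch is not a reproduction of the authors' analysis pipeline.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of observing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(p for p in (prob(k) for k in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Adenosquamous carcinomas: 6 of 25 squamous-class tumours versus 1 of 71 others.
p_value = fisher_exact_two_sided(6, 19, 1, 70)
print(f"P = {p_value:.4f}")
```

With these counts the test gives a value that rounds to the reported P = 0.0011.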


Squamous subtype 

Four core gene programmes characterized squamous tumours 
(Fig. 1b), which included gene networks involved in inflammation, 
hypoxia response, metabolic reprogramming, TGF-β signalling, 
MYC pathway activation, autophagy and upregulated expression of 
TP63ΔN and its target genes. Many of these genes are highly expressed 
in the C2-squamous-like class of tumours of breast, bladder, lung and 
head and neck cancer defined in the Cancer Genome Atlas (TCGA) 
pan-cancer studies, which was the reason we termed them squamous 
(Fig. 2a). As in these other cancer types, the pancreatic squamous subtype 
was associated with mutations in TP53 (P=0.01) and KDM6A 
(P=0.02), which interacts with ASCOM complex constituents MLL2 
and MLL3 (Figs 1a and 2b). Although previous immunohistochemical 
studies have identified increased TP63 expression in adenosquamous 
pancreatic tumours, RNA-seq identified high expression of TP63ΔN 
and its target genes as a key feature (Fig. 2c). TP63ΔN, in the presence 
of TP53 mutation, is known to regulate epithelial cell plasticity, tumorigenicity 
and epithelial to mesenchymal transition in a variety of 
solid tumours. Squamous tumours were enriched for activated α6β1 
and α6β4 integrin signalling, and activated EGF signalling (Extended 
Data Fig. 6 and Supplementary Table 16). The squamous subtype is 
associated with hypermethylation and concordant downregulation of 
genes that govern pancreatic endodermal cell-fate determination (for 
example, PDX1, MNX1, GATA6, HNF1B) leading to a complete loss of 
endodermal identity (Fig. 2d, e and Supplementary Table 17). 

Figure 1 | Molecular classes and transcriptional networks defining 
PDAC. a, Unsupervised analysis of RNA-seq identified 4 PDAC classes: 
squamous (blue); ADEX (abnormally differentiated endocrine exocrine; 
brown); pancreatic progenitor (yellow); and immunogenic (red). 
*P < 0.05, Fisher's exact test. b, Heatmap of gene programmes significantly 
enriched in PDAC. Black dot denotes transcriptional networks showing 
highest significance for an individual class. c, Kaplan-Meier analysis of 
patient survival stratified by class. 


Pancreatic progenitor subtype 

Transcriptional networks containing transcription factors PDX1, 
MNX1, HNF4G, HNF4A, HNF1B, HNF1A, FOXA2, FOXA3 and HES1 
primarily define the pancreatic progenitor class (Extended Data Fig. 7). 
These transcription factors are pivotal for pancreatic endoderm cell-fate 
determination towards a pancreatic lineage and are linked to maturity 
onset diabetes of the young (MODY). PDX1, in particular, is critical 
for pancreas development with ductal, exocrine and endocrine cells all 
derived from embryonic progenitor cells that express PDX1 (ref. 21). 
Gene programmes regulating fatty acid oxidation, steroid hormone 
biosynthesis, drug metabolism and O-linked glycosylation of mucins 
also define pancreatic progenitor tumours. Importantly, apomucins 
MUC5AC and MUC1, but not MUC2 or MUC6, are preferentially 
co-expressed in pancreatic progenitor tumours. The expression of 
these apomucins defines the pancreatobiliary subtype of IPMN and 
is consistent with PDAC-associated IPMN clustering within this class 
(Supplementary Tables 14-16). TGFBR2 inactivating mutations were 
also enriched in this subtype (P= 0.029). 


ADEX subtype 

The ADEX class is defined by transcriptional networks that are impor- 
tant in later stages of pancreatic development and differentiation, and 
is a subclass of pancreatic progenitor tumours. Transcriptional net- 
works that characterize both exocrine and endocrine lineages at later 
stages are upregulated, rather than one or the other as is the case in 
normal pancreas development. The key networks identified include 
upregulation of: (i) transcription factors NR5A2, MIST1 (also known as 
BHLHA15A) and RBPJL and their downstream targets that are important 
in acinar cell differentiation and pancreatitis/regeneration; 
and (ii) genes associated with endocrine differentiation and MODY 
(including INS, NEUROD1, NKX2-2 and MAFA (Extended Data 
Fig. 8 and Supplementary Table 16)). Importantly, several patient- 
derived pancreatic cancer cell lines were enriched with gene pro- 
grammes associated with the ADEX class. Moreover, these cell lines 
expressed multiple genes associated with terminally differentiated 
pancreatic tissues, including AMY2B, PRSS1, PRSS3, CEL and INS. In 
addition, the methylation pattern of ADEX tumours was distinct from 
normal pancreas and clustered with other PCs (Extended Data Fig. 9). 


3 MARCH 2016 | VOL 531 | NATURE | 49 


© 2016 Macmillan Publishers Limited. All rights reserved 


ARTICLE 


[Figure 2 graphic, panels a to e. Recoverable labels: b, coverage 23% (105/456 samples) for
KDM6A (39), MLL3 (37), MLL2 (20) and PPP6R3 (9); KDM6A, squamous (P = 0.02); mutation types:
deletion, non-silent SNV or indel, SV, amplification. c, TAp63 and TP63ΔN. d, M value.
e, GATA6 cg14880184; Cor = −0.85, adjusted P < 0.05.]


Figure 2 | Molecular characterization of the squamous class. a, Boxplot 
of PDAC squamous class signature scores generated using pan-cancer 
12 expression data and stratified by class. b, Mutual exclusivity plot of a 
mutated gene sub-network identified by HotNet2. c, Boxplot of TAp63 and 
TP63ΔN expression levels stratified by class. d, Heatmap of differentially 
methylated genes. e, Hypermethylation of GATA6 is associated with 
the concordant downregulation of GATA6 gene expression. Pearson 
correlation and adjusted P values are as indicated. In a and c the boxplots 
are annotated by a Kruskal-Wallis P value. 


Immunogenic subtype 

The immunogenic class shares many of the characteristics of the pancreatic 
progenitor class, but is associated with evidence of a significant 
immune infiltrate. Associated immune gene programmes included 
B cell signalling pathways, antigen presentation, CD4+ T cell, CD8+ T 
cell and Toll-like receptor signalling pathways (Extended Data Fig. 10 
and Supplementary Table 16). Enrichment analysis identified upregulated 
expression of genes associated with nine different immune 
cell types and/or phenotypes (Fig. 3a). The predominant expression 
profiles were those related to infiltrating B and T cells, with both 
cytotoxic (CD8+) and regulatory T cells (CD4+CD25+FOXP3+ Treg). 
Upregulation of CTLA4 and PD1 acquired tumour immune suppression 
pathways in the immunogenic subtype inferred therapeutic opportunities 
with novel immune modulators (Fig. 3c). 


Immune mechanisms in pancreatic cancer 

To better define candidate molecular mechanisms active in the tumour 
microenvironment, we correlated enrichment of expression patterns 
that characterize specific immune cell populations with each gene programme 
(Fig. 3a and Supplementary Tables 15, 16 and 18). Of all gene 


[Figure 3 graphic, panels a to e. Recoverable labels: GP6, GP7, GP8; REACTOME_PD1_SIGNALING;
BIOCARTA_CTLA4_PATHWAY; d, macrophage signature (median survival 35.8 vs 16.5 months);
e, T cell co-inhibition signature (median survival 34.3 vs 15.9 months); survival axes:
time (months), with numbers at risk for the low and high groups.]

Figure 3 | Immune pathways in PDAC. a, Heatmap showing enrichment 
of immune cell/phenotype gene signatures in PDAC (top panel). Heatmap 
showing correlation of immune cell/phenotype gene signatures with the 
identified PDAC GPs (bottom panel). Numbers in cells represent −log10 of 
correlation significance. b, Boxplot of GP module eigengene (ME) scores 
(a measure of sample gene programme relatedness) stratified by class and 
showing GP class associations. c, Boxplot of PD1 (also known as PDCD1) 
and CTLA4 gene signature scores stratified by class. d, e, Kaplan-Meier 
analysis comparing survival of patients having either high or low immune 
cell/phenotype signature scores. In b and c, the boxplots are annotated by a 
Kruskal-Wallis P value. 


Figure 4 | Gain of function TP53 mutations and loss of TAp63
regulate key GPs associated with the squamous class. a, Significant GP
enrichment of genes deregulated in KPC-mouse-derived cell lines treated
with Trp53 specific short hairpin RNAs (shRNAs). b, Trp53 regulated
genes enriched in either GP 2, 3 or 7. c, Sub-network of genes differentially
expressed between KRAS Trp53^fl/+ and KRAS Trp53^fl/+ Trp63^fl/fl cell lines.


programmes (GP), GP6, GP7 and GP8 were enriched with immune cell
specific gene expression signatures (Fig. 3b). Specifically, GP6 and GP8
were associated with B cell and CD8+ T cell signatures, respectively,
with GP8 associated with the T cell co-inhibitory phenotype (Extended
Data Fig. 10). GP7 was associated with both the macrophage signature
and T cell co-inhibition, which co-segregated with poor survival
(Fig. 3d, e). Importantly, pathway analysis of GP7 also showed enrichment
for antigen processing and presentation, and Toll-like receptor
cascade(s) including high expression of TLR4, TLR7, TLR8, PDCD1LG2
(PD-L2) and CSF1R. The latter are known mediators of tumour-associated
macrophage immunosuppression and inflammation.


TP53 and TP63 modulation of squamous PDAC 

Based on the association of TP53 mutation and upregulated TP63
expression in the squamous subtype, we used cell lines derived from 
genetically engineered mouse models of pancreatic cancer (Kras^G12D/+;
Trp53^fl/+; TAp63^fl/fl KPC mice) to begin to unravel the functional consequences
of these events in defining squamous tumours. Mice with
mutations in the DNA binding domain as compared to TP53-null ani- 
mals have more aggressive disease with increased metastatic potential, 
primarily mediated through platelet-derived growth factor receptor β
(PDGFRB; ref. 25). Analyses of transcriptome data from previous mutant
TP53 knockdown experiments from ref. 25 showed that mutant TP53 
regulates the expression of transcriptional networks associated with 
the squamous subtype, particularly GPs 2 and 3, including PDGFRB 
(Fig. 4a, b; Supplementary Table 20). Kras^G12D/+; Trp53^fl/+; TAp63^fl/fl
mice have more aggressive metastatic pancreatic cancer than their
Kras^G12D/+; Trp53^fl/+ counterparts and also show deregulation of GPs
2 and 3, inferring that TAp63 plays an important role in squamous PC 
(Fig. 4c-e). Transcriptional network analysis identified additional key
factors involved in metastasis that were upregulated in the squamous 
subtype for example, LOX”*. 


Transcriptomic classification of PDAC 

We compared our transcriptome classification with those of 2 pre- 
viously published studies that had either physically’’ or virtually”® 
micro-dissected tumour epithelium to define PC subtypes (Fig. 1a and
Extended Data Fig. 9). Using their classifiers to subtype our data, 3 of 
the classes we defined directly overlap with the Collisson classifica- 
tion, with the exception of the novel immunogenic subtype. We altered 
Collisson’s nomenclature to better reflect the insights into the molec- 
ular pathology and candidate mechanisms that our integrated analysis 
generated. The Collisson ‘quasimesenchymal’ subtype was renamed 
‘squamous’ to reflect the molecular characteristics of squamous 
tumours across multiple tissue types, as defined by the TCGA pan- 
cancer analysis. ‘Classical’ was termed ‘pancreatic progenitor’ based 
on the prominence of transcriptional networks vital for early pancreas 
development, and the predominant discriminator from the squamous 
subtype. The Collisson ‘exocrine-like’ subtype also contained transcriptional
networks characteristic of committed endocrine differentiation and as
a consequence was renamed ADEX. Although approximately 50% of
squamous subtype tumours fell within the ‘basal’ subgroup of Moffitt 
et al.*®, the remainder were composed of a mixture of other Bailey/ 
Collisson subtypes. 

More sophisticated analyses using larger numbers of tumours
continue to reveal novel insights into pancreatic cancer pathophysiology.
In particular, integrated analysis of genomic, epigenomic and
transcriptomic characteristics is generating biological insights with
potential therapeutic relevance. The increased appreciation of the role
of the immune system in cancer development and progression has 
led to new classes of therapeutics that specifically target mechanisms 
through which the tumour evades immune destruction. Therapeutics 
that target some of these mechanisms are currently in clinical trials in 
many cancer types, including pancreatic cancer. Early clinical trial data 
suggest that, similar to most targeted therapies, patient selection will 
also be important for drugs that target the immune system. The novel 
immunogenic subtype of pancreatic cancer is characterized by specific 
mechanisms that can potentially be targeted using immune modulators, 
and testing in clinical trials is encouraged. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS


Human research ethical approvals. APGI: Sydney South West Area Health 
Service Human Research Ethics Committee, western zone (protocol number 
2006/54); Sydney Local Health District Human Research Ethics Committee 
(X11-0220); Northern Sydney Central Coast Health Harbour Human Research 
Ethics Committee (0612-251M); Royal Adelaide Hospital Human Research Ethics 
Committee (091107a); Metro South Human Research Ethics Committee (09/ 
QPAH/220); South Metropolitan Area Health Service Human Research Ethics 
Committee (09/324); Southern Adelaide Health Service/Flinders University 
Human Research Ethics Committee (167/10); Sydney West Area Health Service 
Human Research Ethics Committee (Westmead campus) (HREC2002/3/4.19); 
The University of Queensland Medical Research Ethics Committee (2009000745); 
Greenslopes Private Hospital Ethics Committee (09/34); North Shore Private 
Hospital Ethics Committee. Johns Hopkins Medical Institutions: Johns Hopkins 
Medicine Institutional Review Board (NA00026689). ARC-Net, University of 
Verona: approval number 1885 from the Integrated University Hospital Trust 
(AOUI) Ethics Committee (Comitato Etico Azienda Ospedaliera Universitaria 
Integrata) approved in their meeting of 17 November 2010 and documented by the 
ethics committee 52070/CE on 22 November 2010 and formalized by the Health 
Director of the AOUI on the order of the General Manager with protocol 52438 
on 23 November 2010. Ethikkommission an der Technischen Universität Dresden
(Approval numbers EK30412207 and EK357112012). 

Patient material acquisition and extraction. Samples were acquired through the 
Australian Pancreatic Cancer Genome Initiative (APGI) as part of the International 
Cancer Genome Consortium (ICGC). Informed consent was obtained from all 
subjects. Tissue dissection of primary material, RNA and DNA extraction was 
performed using previously published methods”. Tumour cellularity was esti- 
mated for each sample using a combination of qPure analysis of high-density 
SNP profiles and KRAS amplicon sequencing”. Primary tumours (n= 342) and 
41 patient-derived cell lines (representing low cellularity tumours) (Supplementary
Table 1) underwent whole genome sequencing when tumour cellularity was >40%
(mean coverage 75×, n = 179), or deep-exome sequencing (mean coverage 400×,
n = 204) for samples with a cellularity of 12-40%.

Exome library preparation. Exome libraries were generated using the Illumina 
Nextera Rapid Capture Exome kit (Illumina, Part no. FC-140-1003) according to 
the standard manufacturer's protocol (part no. 15037436 Rev. A February 2013), 
except they were made in an automated high-throughput fashion using Perkin 
Elmer's Sciclone G3 NGS Workstation (Product no. SG3-31020-0300). Then 50 ng 
of gDNA was used as input for tagmentation followed by 10 cycles of PCR to 
produce sufficient library for exome capture. A total of 500 ng of each library was 
pooled as a 12-plex reaction for capture using Illumina’s Nextera Exome Oligo 
set. Following two rounds of capture, samples were finally subjected to 10 cycles 
of PCR to produce exome libraries ready for sequencing. Prior to sequencing, 
exome libraries were qualified via either the Perkin Elmer LabChip GX with the 
DNA High Sensitivity LabChip kit (Perkin Elmer, Part no. CLS760672), or the 
Agilent BioAnalyzer 2100 with the High Sensitivity DNA Kit (Agilent, Part no. 
5067-4626). Quantification of libraries for clustering was performed using the 
KAPA Library Quantification Kit - Illumina/Universal (KAPA Biosystems, Part 
no. KK4824) in combination with the Life Technologies Viia 7 real time PCR 
instrument. 

Whole-genome library preparation. Whole-genome libraries were generated 
using either the Illumina TruSeq DNA LT sample preparation kit (Illumina, Part 
no. FC-121-2001 and FC-121-2001) or the Illumina TruSeq DNA PCR-free LT 
sample preparation kit (Illumina, Part no. FC-121-3001 and FC-121-3002) accord- 
ing to the manufacturer's protocols with some modifications (Illumina, Part no. 
15026486 Rev. C July 2012 and 15036187 Rev. A January 2013 for the two different 
kits respectively). For the TruSeq DNA LT sample preparation kit, 1 µg of gDNA
was used as input for fragmentation to ~300 bp, followed by a SPRI-bead clean up 
using the AxyPrep Mag PCR Clean-Up kit (Corning, Part no. MAG-PCR-CL-250). 
After end-repair, 3’ adenylation and adaptor ligation, the libraries were size- 
selected using a double SPRI-bead method to obtain libraries with an insert size 
~300 bp. The size-selected libraries were subjected to 8 cycles of PCR to produce 
the final whole-genome libraries ready for sequencing. For the TruSeq DNA PCR-free
LT sample preparation kit, 1 µg of gDNA was used as input for fragmentation
to ~350 bp, followed by an end-repair step and then a size-selection using the 
double SPRI-bead method to obtain libraries with an insert size ~350 bp. The size- 
selected libraries then underwent 3’ adenylation and adaptor ligation to produce 
final whole genome libraries ready for sequencing. Prior to sequencing, whole- 
genome libraries were qualified via the Agilent BioAnalyzer 2100 with the High 
Sensitivity DNA Kit (Agilent, Part no. 5067-4626). Quantification of libraries for 
clustering was performed using the KAPA Library Quantification Kit - Illumina/
Universal (KAPA Biosystems, Part no. KK4824) in combination with the Life 
Technologies Viia 7 real time PCR instrument. 

Total RNA library preparation. RNA-Seq libraries were generated using the
Illumina TruSeq Stranded Total RNA LT sample preparation kit (with Ribo- 
Zero Gold) (Illumina, Part no. RS-122-2301 and RS-122-2302), according to 
the standard manufacturer's protocol (Part no. 15031048 Rev. D April 2013), 
except they were made in an automated high-throughput fashion using Perkin 
Elmer’s Sciclone G3 NGS Workstation (Product no. SG3-31020-0300). The 
ribosomal depletion step was performed on 1 µg of total RNA using Ribo-Zero
Gold before a heat fragmentation step aimed at producing libraries with an 
insert size between 120-200 bp. cDNA was then synthesized from the enriched 
and fragmented RNA using SuperScript II Reverse Transcriptase (Invitrogen, 
Catalog no. 18064) and random primers. The resulting cDNA was converted 
into double-stranded DNA in the presence of dUTP to prevent subsequent 
amplification of the second strand and thus maintain the strandedness of 
the library. Following 3’ adenylation and adaptor ligation, libraries were sub- 
jected to 15 cycles of PCR to produce RNA-seq libraries ready for sequencing. 
Prior to sequencing, RNA-seq libraries were qualified via the Perkin Elmer 
LabChip GX with the DNA High Sensitivity LabChip kit (Perkin Elmer, Part 
no. CLS760672). Quantification of libraries for clustering was performed using 
the KAPA Library Quantification Kit - Illumina/Universal (KAPA Biosystems,
Part no. KK4824) in combination with the Life Technologies Viia 7 real time PCR 
instrument. 

Library sequencing. All libraries were sequenced using the Illumina HiSeq 
2000/2500 system with TruSeq SBS Kit v3 - HS (200-cycles) reagents (Illumina, 
Part no. FC-401-3001), to generate paired-end 101 bp reads. 

Sequence alignment and data management. Sequence data was mapped to the 
Genome Reference Consortium GRCh37 assembly using BWA42. All BAM files 
have been deposited in the EGA (accession number: EGAS00001000154). 

Copy number analysis. Matched tumour and normal patient DNA was assayed 
using Illumina SNP BeadChips as per manufacturer's instructions (Illumina, San 
Diego CA) (HumanOmnil-Quad or HumanOmni2.5-8 BeadChips) and analysed 
as previously described”. 

Identification and verification of structural variants. Somatic structural
variants were identified using the qSV tool. A detailed description of its
use has been recently published”*. 

Identification and verification of point mutations. Substitutions and indels
were called using a consensus calling approach that included qSNP, GATK and 
Pindel. The details of call integration and filtering, and verification using orthog- 
onal sequencing and matched sample approaches are as previously described**’. 
97% of KRAS mutations identified by KRAS deep-amplicon sequencing were 
detected via WGS and WES, implying a false negative rate of 3% (Supplementary
Table 1). 

‘Lollipop’ plots. Plots showing the location and frequency of inactivating mutations
were generated using the MutationMapper web tool hosted at
http://www.cbioportal.org/. Available PanCancer mutation data was downloaded from the
Cancer Genomic Data Server (CGDS) hosted by the Computational Biology 
Center (cBio) at the Memorial Sloan-Kettering Cancer Center (MSKCC) using 
the R package “cgdsr””’. 

Mutational signatures. Mutational signatures were defined for genome-wide 
somatic substitutions, as previously described’. 

Significantly mutated gene detection. A combination of three robust approaches
was used to define significantly mutated genes: (i) MutSigCV2 (ref. 30), which
detects genes with point mutations above the background mutation rate;
(ii) OncodriveFM (ref. 31), which detects point mutated genes with a bias towards
pathogenic mutations; and (iii) HotNet2 (ref. 32), which identifies sub-networks
based on protein-protein interactions that contain recurrent point mutations,
copy number alterations and structural rearrangements. The HotNet2 (HotNet
diffusion-oriented subnetworks) algorithm was used to identify significantly
mutated subnetworks in a genome-scale interaction network. Heat scores for each
protein were calculated as the number of samples having a non-silent SNV, indel, 
SV or copy number aberration in the corresponding gene*”. Heat scores were 
limited to proteins having a corresponding gene mutation in >2% of samples. 
The iRefIndex interaction network was used for the analysis*’. Supplementary 
Table 20 contains matrices summarizing all mutations, CNVs and SVs for all 
samples used in this study. 
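
As a rough illustration of the heat-score definition above, the following Python sketch counts, for each gene, the number of samples carrying at least one qualifying alteration and keeps only genes altered in more than 2% of samples. The input format (pairs of sample and gene identifiers) and the function name `heat_scores` are assumptions made for illustration; they are not part of the HotNet2 software.

```python
def heat_scores(alterations, n_samples, min_fraction=0.02):
    """Count altered samples per gene and drop rarely altered genes.

    alterations: iterable of (sample_id, gene) pairs, one per non-silent
    SNV, indel, SV or copy number aberration (hypothetical input format).
    """
    samples_by_gene = {}
    for sample_id, gene in alterations:
        # A set ensures each sample is counted once per gene.
        samples_by_gene.setdefault(gene, set()).add(sample_id)
    return {
        gene: len(samples)
        for gene, samples in samples_by_gene.items()
        if len(samples) / n_samples > min_fraction
    }
```

Several alterations of the same gene in one sample contribute a single count, which matches the "number of samples" wording of the definition.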

RNA sequencing library generation and sequencing. RNA-seq libraries were 
generated using TruSeq Stranded Total RNA (part no. 15031048 Rev. D April 2013) 
kits on a Perkin Elmer Sciclone G3 NGS Workstation (product no. SG3-31020-0300).
The ribosomal depletion step was performed on 1 µg of total RNA using
Ribo-Zero Gold before a heat fragmentation step aimed at producing libraries with 
an insert size between 120-200 bp. cDNA was then synthesized from the enriched 
and fragmented RNA using Invitrogen’s SuperScript II Reverse Transcriptase 
(catalogue number 18064) and random primers. The resulting cDNA was fur- 
ther converted into double stranded DNA in the presence of dUTP to prevent 
subsequent amplification of the second strand and thus maintain the strandedness
of the library. Following 3’ adenylation and adaptor ligation, libraries were
subjected to 15 cycles of PCR to produce RNA-seq libraries ready for sequencing. 
Prior to sequencing, exome and RNA-seq libraries were qualified and quantified 
via Caliper’s LabChip GX (part no. 122000) instrument using the DNA High 
Sensitivity Reagent kit (product no. CLS760672). Quantification of libraries for 
clustering was performed using the KAPA Library Quantification Kits For Illumina 
sequencing platforms (kit code KK4824) in combination with Life Technologies 
Viia 7 real time PCR instrument. 

RNA-seq analysis. Sequencing reads were mapped to transcripts corresponding to
Ensembl 70 annotations using RSEM™. RSEM data were normalized using TMM
(weighted trimmed mean of M-values) as implemented in the R package ‘edgeR’.
For downstream analyses, normalized RSEM data were converted to counts per
million (c.p.m.) and log2 transformed*°. Genes without at least 1 c.p.m. in 20% of
the samples were excluded from further analysis.
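
The conversion and filtering steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: it omits TMM normalization, the pseudocount added before the log2 transform is an assumption, and the function names are hypothetical.

```python
import math

def cpm_matrix(counts):
    """counts: dict mapping gene -> list of expected counts, one per sample."""
    genes = list(counts)
    n_samples = len(counts[genes[0]])
    library_sizes = [sum(counts[g][j] for g in genes) for j in range(n_samples)]
    return {
        g: [counts[g][j] * 1e6 / library_sizes[j] for j in range(n_samples)]
        for g in genes
    }

def filter_and_log2(counts, min_cpm=1.0, min_fraction=0.2, pseudocount=1.0):
    """Keep genes with >= min_cpm in >= min_fraction of samples, then log2."""
    kept = {}
    for gene, values in cpm_matrix(counts).items():
        n_expressed = sum(v >= min_cpm for v in values)
        if n_expressed >= min_fraction * len(values):
            # The pseudocount (an assumption) keeps zero counts finite.
            kept[gene] = [math.log2(v + pseudocount) for v in values]
    return kept
```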

RNA-seq re-analysis of Weismuller et al. RNA sequencing data reported in 
ref. 25 was downloaded from the Sequence Read Archive (SRA; accession number
SRP033333). The available data was re-analysed using an RNA-seq pipeline
implemented in the bcbio-nextgen project (https://bcbio-nextgen.readthedocs.org/ 
en/latest/). Briefly, after quality control and adaptor trimming, reads were aligned 
to the UCSC mouse mm10 genome build using STAR**. Counts for known genes
were generated using the function featureCounts in the R/Bioconductor package
“Rsubread””. The R/Bioconductor package “DESeq2” was used to identify differentially
expressed genes*®.

KRAS Trp53^fl/+ and KRAS Trp53^fl/+ Trp63^fl/fl mouse derived cell lines. Cell
lines were generated in house from pancreatic tumours harvested from Pdx1-Cre,
LSL-Kras^G12D/+, Trp53^fl/+ mice or Pdx1-Cre, LSL-Kras^G12D/+, Trp53^fl/+, TAp63^fl/fl
mice described previously’. Low passage cell lines were used and authenticated
by morphology. Mycoplasma testing confirmed that all cell lines were mycoplasma 
negative. Independently derived cell lines representing either the KRAS Trp53^fl/+
(n = 3) or KRAS Trp53^fl/+ Trp63^fl/fl (n = 3) genotype were used for RNA-seq analysis.
RNA-seq libraries were generated using the KAPA stranded RNaseq Kit with
RiboErase (HMR) (KAPA Biosystems; kit ref. KR1151 - v3.15) according to the 
manufacturer's instructions. Briefly, samples were fragmented for 6 min at 94°C 
with 10 cycles of library amplification. Library quality control was performed using 
an Agilent BioAnalyzer 2100 in combination with a High Sensitivity DNA Kit 
(Agilent, Part no. 5067-4626). Samples were evenly pooled to a 2nM concentration 
and a 1% PhiX control spike-in was used for sequencing quality control. Libraries 
were run on the NextSeq 500 platform according to the manufacturer's instruc- 
tions (Illumina, San Diego CA). Sequenced libraries were mapped to UCSC mouse 
mm10 genome build using TopHat and differential gene expression determined 
using Cufflinks 2.1.1 and Cuffdiff 2.1.1 as implemented in BaseSpace
(https://basespace.illumina.com/home/index; Illumina, San Diego CA).

Microarray analysis. Tumour RNA was assayed using HumanHT-12 v4 
Expression BeadChips as per manufacturer’s instructions (Illumina, San Diego 
CA) and analysed as previously described’. Batch correction was performed using 
the R package ‘sva’’. 

Clustering. Non-negative matrix factorization (NMF) was employed to identify 
stable sample clusters*’. The top 2,000 most variable genes were used as input.
NMF parameters: Brunet algorithm; k= 1 to k=7 clusters; number of clusterings 
to build consensus matrix = 20; error function = Euclidean; and 500 iterations. 
The preferred clustering result was determined using the observed cophenetic 
correlation between clusters and the average silhouette width of the consensus 
membership matrix as determined by the R package ‘cluster’. The R package
‘ConsensusClusterPlus”! was also employed to verify sample clustering. Similar 
sample clusters were obtained using both methods (data not shown). The pack- 
age ‘ConsensusClusterPlus’ was also used to subtype PC samples according to the 
expression signatures defined in Moffitt et al.?* 
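
To make the consensus step concrete, the sketch below builds a consensus matrix from repeated clusterings: each entry is the fraction of runs in which two samples were assigned to the same cluster. It assumes the cluster labels from each run (for example, 20 NMF runs) are already available as lists; the NMF factorization itself and the cophenetic correlation are not shown, and the function name is hypothetical.

```python
def consensus_matrix(clusterings):
    """clusterings: list of label lists, one list per clustering run.

    Returns an n x n matrix whose (i, j) entry is the fraction of runs
    in which samples i and j received the same cluster label.
    """
    n = len(clusterings[0])
    together = [[0.0] * n for _ in range(n)]
    for labels in clusterings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    together[i][j] += 1.0
    runs = len(clusterings)
    return [[value / runs for value in row] for row in together]
```

Entries close to 0 or 1 indicate sample pairs that are assigned consistently across runs, which is what a stable clustering solution is expected to show.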

Differential gene expression (DGE). To identify the most representative samples 
within each cluster, we computed silhouette widths using the R ‘cluster’ package. 
Samples with positive silhouette widths were retained for DGE analysis. DGE 
analysis between representative samples was performed using the function ‘voom’
as implemented in the R package ‘edgeR’. To define genes differentially expressed
between all classes we used the function ‘sam’ as implemented in the R package
‘siggenes’.
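
The selection of representative samples can be illustrated with a small Python sketch of the silhouette width, s = (b − a) / max(a, b), where a is the mean distance from a sample to the other members of its own cluster and b is the smallest mean distance to any other cluster. The precomputed distance matrix and the function names are assumptions for illustration; the study used the R ‘cluster’ package.

```python
def silhouette_widths(dist, labels):
    """dist: symmetric matrix of pairwise distances; labels: cluster per sample."""
    n = len(labels)
    clusters = set(labels)
    widths = []
    for i in range(n):
        mean_dist = {}
        for c in clusters:
            members = [j for j in range(n) if labels[j] == c and j != i]
            if members:
                mean_dist[c] = sum(dist[i][j] for j in members) / len(members)
        own = labels[i]
        others = [d for c, d in mean_dist.items() if c != own]
        if own not in mean_dist or not others:
            widths.append(0.0)  # singleton cluster or single-cluster input
            continue
        a, b = mean_dist[own], min(others)
        widths.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return widths

def representative_samples(dist, labels):
    """Indices of samples with a positive silhouette width."""
    return [i for i, w in enumerate(silhouette_widths(dist, labels)) if w > 0]
```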

Gene sets. Gene sets representing immune cell-type expression markers and 
immune meta-genes were obtained from a recent publication’. Gene sets repre- 
senting PDAC classes were generated by selecting significantly upregulated genes 
in a given class versus all other classes. An adjusted P value of 0.01 was used as the 
cut-off in each case. 

Gene set enrichment. Gene set enrichment was performed using the R package
‘GSVA’ (function gsva - arguments: method = “gsva”, mx.diff = TRUE)". GSVA
implements a non-parametric unsupervised method of gene set enrichment
that allows an assessment of the relative enrichment of a selected pathway across
the sample space. The output of GSVA is a gene-set by sample matrix of GSVA 
enrichment scores that are approximately normally distributed. GSVA enrichment 
scores were generated for each gene set using the transformed RSEM data unless 
otherwise indicated. For survival analyses, sample GSVA enrichment scores were 
stratified into quantiles (for example, lower 33% or upper 66% of values). 
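
One possible reading of this stratification, sketched in Python: samples are ranked by enrichment score, the lowest third is labelled "low" and the remainder "high". The exact cut-off rule and the handling of tied scores are assumptions, and the function name is hypothetical.

```python
def stratify_lower_third(scores):
    """scores: dict mapping sample id -> enrichment score.

    Labels the lowest-scoring third of samples "low" and the rest "high".
    Ties at the boundary are broken by sort order (an assumption).
    """
    ordered = sorted(scores, key=scores.get)
    n_low = len(ordered) // 3
    return {
        sample: ("low" if rank < n_low else "high")
        for rank, sample in enumerate(ordered)
    }
```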

WGCNA. Weighted gene co-expression network analysis (WGCNA) was used to
generate a transcriptional network from the normalized and transformed RSEM“. 
Briefly, WGCNA clusters genes into network modules using a topological overlap 
measure (TOM). The TOM is a highly robust measure of network interconnect- 
edness and essentially provides a measure of the connection strength between two 
adjacent genes and all other genes in a network. Genes are clustered using 1-TOM 
as the distance measure and gene modules are defined as branches of the resulting 
cluster tree using a dynamic branch-cutting algorithm*®. 
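
For readers unfamiliar with the TOM, the sketch below computes the standard unsigned topological overlap from an adjacency matrix with entries in [0, 1]: TOM(i, j) = (sum over u of a(i, u)·a(u, j) + a(i, j)) / (min(k_i, k_j) + 1 − a(i, j)), where k_i is the connectivity of gene i. It is an illustration of the formula only, assuming the adjacency has already been computed; it is not the WGCNA implementation, and the function name is hypothetical.

```python
def topological_overlap(adj):
    """adj: symmetric adjacency matrix with entries in [0, 1], zero diagonal."""
    n = len(adj)
    # Connectivity: summed adjacency of each gene to all other genes.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = sum(
                adj[i][u] * adj[u][j] for u in range(n) if u != i and u != j
            )
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom
```

The dissimilarity used for hierarchical clustering is then 1 − TOM(i, j).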

The module eigengene is used as a measure of module expression in a given 
sample and is defined as the first principal component of a module. To relate sample
traits of interest to gene modules, sample traits were correlated to module
eigengenes and significance determined by a Student asymptotic P value for the 
given correlations. For gene module survival analyses, module eigengenes were 
stratified into quantiles (for example, lower 33% or upper 66% of values). To relate 
gene modules to PDAC classes, PDAC class gene set GSVA enrichment scores were 
used as sample traits and correlated with the module eigengenes as discussed above. 
Similarly, to relate the immune cell-type expression markers and immune meta- 
genes to the gene modules each immune GSVA enrichment score was correlated 
with the module eigengenes as before. 
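
The eigengene and trait-correlation steps can be sketched in plain Python. The example standardizes each gene across samples, obtains the first principal component by power iteration, orients it to agree with the average standardized expression, and correlates it with a sample trait. Power iteration is used here only to keep the example dependency-free; it assumes no gene has zero variance and that the starting profile is not orthogonal to the leading component. Function names are hypothetical.

```python
import math

def standardize(values):
    """Centre to mean 0 and scale to unit sample standard deviation."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values]

def pearson(x, y):
    zx, zy = standardize(x), standardize(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)

def module_eigengene(expr, iterations=100):
    """expr: one list of per-sample expression values for each module gene."""
    z = [standardize(row) for row in expr]
    n_samples = len(z[0])
    # Start from a gene profile; a constant vector would be mapped to zero
    # because every standardized row sums to zero.
    v = z[0][:]
    for _ in range(iterations):
        gene_scores = [sum(row[j] * v[j] for j in range(n_samples)) for row in z]
        v = [
            sum(z[g][j] * gene_scores[g] for g in range(len(z)))
            for j in range(n_samples)
        ]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    mean_profile = [sum(row[j] for row in z) / len(z) for j in range(n_samples)]
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-x for x in v]  # orient along the average expression profile
    return v
```

A trait correlation is then `pearson(module_eigengene(expr), trait)`, with one trait value per sample.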

To determine the enrichment of differentially expressed mouse genes in modules
generated by WGCNA, mouse identifiers were first mapped to their corresponding
human HGNC symbol using the R/Bioconductor package “biomaRt”.
Module gene enrichment was then determined using the function userListEnrichment
in the WGCNA package. We considered, as significant, only those modules
showing both significant enrichment and significant gene expression/gene module 
correlations. 

Pathway analysis. Ontology and pathway enrichment analysis was performed 
using the R package ‘dnet’“® and/or the Reactome FI Cytoscape plugin 4.1.1 
(ref. 47) as indicated. The R package ‘dnet’ was also used to identify significant 
sub-networks of differentially expressed genes. 

Pan-cancer 12 data and squamous assignment. Platform corrected input data was 
obtained from Synapse as part of the Pan-Cancer 12 data freeze (syn1715755)'8. 
Pan-cancer 12 subtype assignments were also obtained from Synapse (syn1889916) 
and sample sizes, as indicated, used for statistical comparisons. To determine the 
relationship between the PDAC classes and the pan-cancer 12 subtypes, PDAC 
class gene sets were used in combination with the pan-cancer 12 expression data to 
generate GSVA enrichment scores as discussed above. Sample GSVA enrichment
scores representing each PDAC class were then stratified according to the pan- 
cancer 12-subtype assignments. A Kruskal-Wallis test was applied to the stratified 
scores to determine whether the distributions differed. 
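
As an illustration of the test statistic, the Python sketch below computes the Kruskal-Wallis H statistic with mid-ranks for ties and the usual tie correction. It returns H and the degrees of freedom; the P value would then be taken from a chi-square distribution with that many degrees of freedom, which is not computed here. The sketch assumes at least two distinct values overall, and the function name is hypothetical.

```python
def kruskal_wallis_h(groups):
    """groups: list of lists of scores, one list per class.

    Returns (H, degrees_of_freedom) with the standard tie correction.
    """
    pooled = sorted(
        (value, g) for g, values in enumerate(groups) for value in values
    )
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for _, g in pooled[i:j]:
            rank_sums[g] += mid_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        r * r / len(values) for r, values in zip(rank_sums, groups)
    ) - 3.0 * (n_total + 1)
    correction = 1.0 - tie_term / (n_total ** 3 - n_total)
    return h / correction, len(groups) - 1
```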

Methylation analysis. Sample methylation was determined using Illumina 450K 
arrays as previously described’ with the following modifications. Probe-level 
Illumina GenomeStudio output files were imported into R package ‘lumi’* and data
filtered to remove failed hybridizations, probes comprising SNPs and probes 
located on sex chromosomes. The filtered methylation values were then colour 
balance corrected and normalized using Shift and Scaling Normalization (SSN) 
as implemented by lumi. Gene methylation values were obtained by collapsing 
probe level values for a given gene locus (that is, probes located 1,500 bp upstream
of the transcriptional start site (TSS) through to the end of transcription) using
the function collapseRows(method = “maxRowVariance”) from the package
WGCNA“. Probes were mapped to gene loci using the R package “genomicFeatures”.
Differential gene methylation between representative samples (selected
as above under heading Differential gene expression (DGE)) was determined using
the R package ‘limma’. M-values were used for differential gene methylation analysis.
Concordant changes in methylation and expression were calculated as follows.
Probes were mapped to a given gene using the R package “genomicFeatures”. As 
above, probes located 1,500 bp upstream of the TSS through to the end of transcription
were considered for each gene. The correlation between a probe β-value
and the corresponding gene log2 (CPM) expression value was then calculated using
Pearson's correlation coefficient. The statistical significance of each probe/gene 
correlation was calculated by permuting the data 10,000 times and comparing 
the correlation coefficients obtained before and after permutation. The resulting 
P values were adjusted for multiple testing using the approach of Benjamini and 
Hochberg. 
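
The permutation test and multiple-testing adjustment can be sketched as follows. The two-sided comparison of absolute correlation coefficients and the add-one correction in the permutation P value are assumptions, since the text does not specify these details; the Benjamini-Hochberg step follows the standard step-up procedure. Function names are hypothetical.

```python
import math
import random

def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(x, y, n_permutations=10000, seed=0):
    """Two-sided permutation P value for the Pearson correlation of x and y."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction (an assumption) avoids reporting a P value of zero.
    return (hits + 1) / (n_permutations + 1)

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```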

Concordant copy number expression analysis. Analysis of variance was used 
to identify significant changes in gene expression between samples exhibiting 
corresponding gene copy number aberrations. Accordingly, gene expression 
values were stratified on the basis of sample copy number change for a given gene.
Deleted genes and genes having copy number >8 were considered for the analysis. 
P values were adjusted using the R package ‘qvalue’ for which adjusted values <0.05 
were considered statistically significant. Variance was similar between groups 
compared. Genes showing copy number aberrations in less than 3% of samples 
were excluded from the analysis. 

Survival analysis. The date of diagnosis and the date and cause of death were 
obtained from the Central Cancer Registry and treating clinicians. Median 
survival was estimated using the Kaplan-Meier method and the difference 
was tested using the log-rank test. P values of less than 0.05 were considered 
statistically significant. Clinicopathologic variables analysed with a P value 
<0.25 on log-rank test were entered into Cox proportional hazards multivar- 
iate analysis. Statistical analysis was performed using StatView 5.0 Software 
(Abacus Systems, Berkeley, CA, USA). Disease-specific survival was used as 
the primary endpoint. 
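
For illustration, the Kaplan-Meier estimate and the median survival derived from it can be sketched in Python as below. The input format (follow-up times with an event indicator of 1 for death from disease and 0 for censoring) and the function names are assumptions; the log-rank test and Cox modelling are not shown, and the study itself used StatView.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up time per patient; events: 1 = event, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # events and censored cases both leave the risk set
    return curve

def median_survival(curve):
    """First time at which the survival estimate falls to 0.5 or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached
```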

Stromal cell and immune infiltrate quantification. To quantify stromal and 
immune cell tumour contributions we used the R package ‘estimate’. 

Statistical analysis. A Kruskal-Wallis test was applied to the indicated stratified 
scores to determine whether distributions were significantly different. For all 
PDAC class comparisons using RNA-seq data, the following sample sizes were 
compared: ADEX (n = 14); immunogenic (n = 24); squamous (n = 20); and pancreatic
progenitor (n = 25). Fisher's exact tests were used to evaluate the association
between dichotomous variables. 
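
A two-sided Fisher's exact test for a 2 × 2 table can be sketched in Python using the hypergeometric distribution. The two-sided P value is computed here as the sum of the probabilities of all tables that are no more likely than the observed one, which is a common convention but an assumption about what was used in the study; the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    low, high = max(0, col1 - row2), min(row1, col1)
    observed = prob(a)
    # The small tolerance guards against floating-point ties.
    p = sum(
        prob(x) for x in range(low, high + 1) if prob(x) <= observed * (1 + 1e-7)
    )
    return min(1.0, p)
```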

Data. Data presented in this study can be downloaded from the following reposi- 
tory https://dcc.icgc.org/repositories under the identifier PACA-AU. 

Extended Data Figure 1 | Mutational landscape of PC. a, Barplot
representing the somatic mutation rate for each of the 456 samples 
included in this analysis. b, Non-silent mutations (blue), amplifications 
(>8 copies, red), deletions (purple) and structural variants (SV, green) 
ranked in order of exclusivity. c, Significantly mutated genes identified by
OncodriveFM. An asterisk denotes a significantly mutated gene identified
by both MutSigCV and OncodriveFM. d, PC mutation functional
interaction (FI) sub-network identified by the ReactomeFI cytoscape 

Extended Data Figure 2 | Selected genomic events in PC. a, Lollipop
plots showing the type and location of mutations in the RNA processing 
genes RBM10, SF3B1 and U2AF1 and the tumour suppressor TP53. In each 
plot, mutations observed across multiple cancers (top plot; PanCancer) are 
compared with those observed in the current study (bottom plot; PDAC). 
Significant recurrent mutations are labelled above the relevant lollipop. 

b, Regions of copy number alteration showing concordant gene expression
changes. For each of the indicated chromosomes, significant GISTIC peaks
are shown at their respective genomic locations (x axis) as grey bars. Each
gene is represented by a dot at its specific chromosomal coordinate, with
blue representing concordant copy number loss and gene downregulation
and red representing concordant copy number amplification (copy
number > 8) and gene upregulation. Significance of concordant copy
number/expression change is measured as a value of −log10 (q-value)
times the sign of the direction of change. Dotted lines represent a
significance threshold of −log10 (q-value = 0.05) times the sign of the
direction of change. Genes showing concordant copy number/expression
changes and overlapping GISTIC peaks are listed above the plot. Asterisk
denotes known PC oncogenes showing amplification but non-significant
concordant copy number/expression change.
[Panel b graphic: x axis, genomic location (bp); genes labelled above the plots include ME2, ELAC1, SMAD4 and MEX3C in one plot and UQCRFS1, POP4, PLEKHF1, C19orf12, CCNE1, URI1 and AKT2 in the other.]

Extended Data Figure 3 | Classification of PC into 4 classes.
a, Unsupervised classification of PC RNAseq using NMF. Solutions are
shown for k = 2 to k = 7 classes. A peak cophenetic correlation is observed
for k = 4 classes. b, Silhouette information for k = 4 classes. c-e, Boxplots
representing QPURE, stromal signature scores and immune signature
scores stratified by class. Boxplots are annotated by a Kruskal-Wallis

Extended Data Figure 4 | Identification of 4 robust PC classes
in 232 PCs with mixed low and high cellularity. a, Unsupervised
classification of PC expression array data representing 232 samples
using NMF. Solutions are shown for k = 2 to k = 7 classes. b, Silhouette
information for k = 4 classes. c, Heatmap showing differential gene
expression between classes. d, Boxplots representing QPURE, stromal
signature scores and immune signature scores stratified by class.
e, Boxplots representing ADEX, pancreatic progenitor, squamous and
immunogenic signature scores defined using the RNA-seq PC set
stratified by class. Boxplots in d and e are annotated by a Kruskal-Wallis
P value. For comparisons the following sample sizes were used: ADEX
(n = 49); immunogenic (n = 67); squamous (n = 71); and pancreatic
progenitor (n = 45).
Extended Data Figure 5 | Characterization of PC subtypes. a, Heatmap
showing the statistical significance of correlations observed between the
expressions of genes significantly expressed in each PC class and gene
programmes identified by WGCNA. Pearson correlations and Student's
asymptotic P values are provided in each cell. b, Principal component
analysis (PCA) using methylation data. Plot showing pairwise comparisons
of samples distributed along the identified principal components (PC).
Adjacent non-tumorous pancreatic samples represented as green points 
cluster as a distinct group. PC samples represented by points coloured 
brown (ADEX), blue (squamous), orange (pancreatic progenitor) and red 
(immunogenic) cluster together. c, Venn diagram showing the number 
of common and unique genes differentially methylated in the indicated 
is observed that distinct subsets of genes are differentially methylated
in the 4 PC subtypes. d, Heatmap showing genes that are significantly
methylated between tumours comprising the squamous class and all other 
classes. Methylation values for the same genes in adjacent non-tumorous 
pancreas are also shown. e-h, Plots showing regulation of gene expression 
by methylation. Hyper- or hypomethylation of the indicated probe is 
associated with either the concordant downregulation or upregulation of 
the indicated gene. Pearson correlation and adjusted P values are provided 
for each gene methylation comparison. Boxplot colours designate class: 
ADEX (brown); immunogenic (red); squamous (blue); and pancreatic 
progenitor (orange). Single letter designations representing the first letter
of each class are provided under the relevant boxes in each plot.
Extended Data Figure 6 | Core gene programmes (GP) defining the
squamous class. Each panel shows from left to right: (i) a heatmap
representing the genes in the specified gene programme most correlated
with the indicated PC class with tumours ranked according to their gene
programme module eigengene values (MEs) (PC classes are designated
by colour as follows: ADEX (brown); pancreatic progenitor (orange);
immunogenic (red); and squamous (blue)); (ii) Kaplan-Meier analysis
comparing survival of patients having either high or low gene programme
MEs; (iii) pathways significantly enriched in a given GP functional
interaction (FI) sub-network defined by the ReactomeFI Cytoscape plugin.
P values represent FDR < 0.05.
Extended Data Figure 7 | Gene programme defining the pancreatic
progenitor class. a, Panel showing from left to right: (i) a heatmap
representing the genes in GP1 most correlated with the pancreatic
progenitor class with tumours ranked according to their GP1 module
eigengene values (MEs); (ii) Kaplan-Meier analysis comparing survival of
patients having either high or low GP1 MEs; (iii) pathways significantly
enriched in a GP1 FI sub-network defined by the ReactomeFI Cytoscape
plugin. P values represent FDR < 0.05. b, Network diagram depicting
pathways significantly enriched in GP1 (FDR < 0.0001). Different node
colours indicate different network clusters or closely interconnected genes.
Extended Data Figure 8 | Gene programmes defining the ADEX class.
a, b, Panel showing from left to right: (i) a heatmap representing the genes
in the specified GP most correlated with the ADEX class with tumours
ranked according to their GP module eigengene values (MEs); (ii) Kaplan-Meier
analysis comparing survival of patients having either high or low GP
MEs; (iii) pathways significantly enriched in a GP FI sub-network defined
by the ReactomeFI Cytoscape plugin. P values represent FDR < 0.05.
c, Network diagram depicting pathways significantly enriched in GP9
(FDR < 0.0001). Different node colours indicate different network clusters
or closely interconnected genes. Genes comprising GP9 are indicated as
coloured circles, whereas linker genes (genes not comprising GP9 but
forming multiple connections in the network) are indicated as coloured
diamonds. d, Network diagram depicting pathways significantly enriched
in GP10 (FDR < 0.0001). Different node colours indicate different network
clusters or closely interconnected genes.
Extended Data Figure 9 | Stratification of PC RNASeq data
according to Moffitt et al. a, Heatmap showing the stratification of the
PC cohort of the current study using the tumour subtype classifier
published in Moffitt et al.8. PCs were classified by consensus clustering
using the top 50 weighted genes associated with the basal-like or classical
subtypes. b, Boxplots showing the distribution of normal and activated
stroma signature scores between the 4 PC classes identified in the current
study. Boxplots are annotated by a Kruskal-Wallis P value. A significant
difference in activated stroma signature scores was observed between
squamous and ADEX tumours (P value < 0.01, t-test). Boxplot colours
designate class: ADEX (brown); immunogenic (red); squamous (blue);
and pancreatic progenitor (orange). c, Plots showing correlation between
tumour cellularity, presented as a QPURE score, and either activated
or normal stroma signature scores. Plots are annotated with Pearson
correlation scores and significance values, with a linear fit represented by a
solid line. Sample ICGC_0338, a rare acinar cell carcinoma, is highlighted.
This sample exhibits near 100% cellularity and has low activated or
normal stroma signature scores. d, Principal component analysis (PCA)
using methylation data. Plot showing pairwise comparisons of samples
distributed along the identified principal components (PC). Adjacent
non-tumorous pancreatic samples represented as green points cluster as
a distinct group relative to ADEX samples (brown and red points). Rare
acinar cell carcinomas (red) cluster with other ADEX samples (brown).
All other PC samples are shown as grey points. e, Plot showing the
correlation of expression of representative genes expressed in acinar
cell carcinoma sample ICGC_0338 compared to the median expression
of the same genes across all other ADEX samples. A red shaded region
encompasses genes showing high median expression in all other ADEX
but low expression in ICGC_0338. A brown shaded region encompasses
genes showing high median expression in all other ADEX and correlatively
high expression in ICGC_0338. Pearson's correlation and significance are
indicated.
Extended Data Figure 10 | Gene programmes defining the
immunogenic class. a-c, Each panel shows from left to right: (i) a
heatmap representing the genes in the specified gene programme most
correlated with the indicated PC class with tumours ranked according
to their gene programme module eigengene values (MEs). PC
classes are designated by colour as follows: ADEX (brown); pancreatic
progenitor (orange); immunogenic (red); and squamous (blue);
(ii) Kaplan-Meier analysis comparing survival of patients having either
high or low gene programme MEs; (iii) pathways significantly enriched
in a given GP functional interaction (FI) sub-network defined by the
ReactomeFI Cytoscape plugin. Corresponding Cytoscape files comprising
GP ReactomeFI subnetworks are provided. d, Boxplot of immune gene
expression stratified by class. Boxplots are annotated by a Kruskal-Wallis
P value and box colours designate class: ADEX (brown); immunogenic
(red); squamous (blue); and pancreatic progenitor (orange). Single letter
designations representing the first letter of each class are provided under
the relevant boxes in each plot.

High-fat diet enhances stemness and
tumorigenicity of intestinal progenitors

Semir Beyaz, Miyeko D. Mana, Jatin Roper, Dmitriy Kedrin, Assieh Saadatpour, Sue-Jean Hong,
Monther Abu-Remaileh, Maria M. Mihaylova, Dudley W. Lamming, Rizkullah Dogum, Guoji Guo, George W. Bell,

Little is known about how pro-obesity diets regulate tissue stem and progenitor cell function. Here we show that high-fat
diet (HFD)-induced obesity augments the numbers and function of Lgr5+ intestinal stem cells of the mammalian intestine.
Mechanistically, a HFD induces a robust peroxisome proliferator-activated receptor delta (PPAR-δ) signature in intestinal
stem cells and progenitor cells (non-intestinal stem cells), and pharmacological activation of PPAR-δ recapitulates the
effects of a HFD on these cells. Like a HFD, ex vivo treatment of intestinal organoid cultures with fatty acid constituents
of the HFD enhances the self-renewal potential of these organoid bodies in a PPAR-δ-dependent manner. Notably,
HFD- and agonist-activated PPAR-δ signalling endow organoid-initiating capacity to progenitors, and enforced PPAR-δ
signalling permits these progenitors to form in vivo tumours after loss of the tumour suppressor Apc. These findings
highlight how diet-modulated PPAR-δ activation alters not only the function of intestinal stem and progenitor cells, but
also their capacity to initiate tumours.


The mammalian intestine is known to respond to dietary signals.
Lgr5+ intestinal stem cells (ISCs) remodel intestinal composition in
response to diet-induced cues by adjusting their production of daughter
stem cells and progenitor cells (non-ISC, transit-amplifying cells), the
latter of which differentiate into the diverse cell types of the intestine.
The Lgr5+ ISCs reside at the base of intestinal crypts adjacent to Paneth
cells, which are a central component of the ISC niche and regulate stem-cell
biology in response to calorie-restricted diets.

Although important epidemiological and rodent studies link obesity
to colon cancer incidence, little is known about how the adaptation
of stem and progenitor cells to pro-obesity diets alters the potential of
these cells to initiate tumours. In the mouse intestine, Lgr5+ ISCs serve
as the cell-of-origin for the precancerous adenomatous lesions caused
by loss of the Apc tumour suppressor gene; yet, it is unclear whether
this occurs in the context of obesity-linked intestinal tumorigenesis.
Here, we interrogate how long-term HFD-induced obesity influences
intestinal stem and progenitor cell function and the cellular origins of
intestinal dysplasia.


HFD boosts ISC counts and crypt function 

To assess the effects of obesity on intestinal homeostasis, we maintained
mice on a long-term HFD (60% fat diet; Extended Data Fig. 1o) for
9-14 months, which is sufficient to observe many of the metabolic
phenotypes associated with obesity. Consistent with previous
reports, HFD-fed mice gained considerably more mass than their standard
chow-fed counterparts (Extended Data Fig. 1a). While the small
intestines from HFD-fed mice were shorter in length (Extended Data
Fig. 1c) and weighed less (Extended Data Fig. 1b), there was no change
in the density of crypt-villous units (Extended Data Fig. 1d) or the
number of apoptotic cells (Extended Data Fig. 1n). Morphologically,
a HFD led to a mild reduction in villi length (Extended Data Fig. 1g),
an associated decrease in villous enterocyte numbers (Extended Data
Fig. 1f), and an increase in crypt depth (Extended Data Fig. 1e). A HFD
did not change the numbers of chromogranin A+ enteroendocrine cells
or Alcian blue+ goblet cells per crypt-villus unit of the small intestine
(Extended Data Fig. 2a-d).

To address how a HFD affects the frequency of ISCs, we performed
in situ hybridization for olfactomedin 4 (Olfm4), a marker expressed
by the Lgr5+ ISCs. Compared to mice fed a standard chow diet,
those on a HFD had a 50% increase in the number of Olfm4+ ISCs
(Fig. 1a and Extended Data Fig. 1l). By contrast, a HFD reduced
cryptdin 4+ (Crp4+, also known as Defa4+) niche Paneth cell numbers
by 23% (Fig. 1a and Extended Data Fig. 1m). These observations lead to
two conclusions: first, a HFD enhances ISC numbers and self-renewal
(for example, deeper crypts with more Olfm4+ ISCs) at the expense of
differentiation (shorter and less cellular villi); and, second, the increase
in ISCs occurs despite a reduction in Paneth cell numbers, raising the
possibility that under a HFD, ISCs adjust to fewer interactions from
their Paneth cell niche.

Given that ISC numbers and proliferation (Fig. 1b, Extended Data
Figs 1h-k and 3f and Supplementary Information) increase in a HFD,
we asked whether a HFD also boosts intestinal regeneration. Using an
in vitro approach, we assessed the ability of isolated intestinal crypts to
form organoid bodies in 3-D culture. These organoids recapitulate the
epithelial architecture and cellular diversity of the mammalian intestine
and are a proxy for ISC activity, as only stem cells can initiate and
maintain these structures long-term. HFD-derived crypts from the
small intestine and colon were more likely to initiate mini-intestines in
Figure 1 | HFD augments ISC numbers and function. a, Quantification
of Olfm4+ ISCs (n = 3) and Crp4+ Paneth cells (n = 6) in the proximal
jejunum of control (C) and HFD-fed mice by in situ hybridization. b, BrdU
incorporation in ISCs (crypt base columnar cells) and progenitors
(transit-amplifying cells) after a 4-h pulse (n = 6). c-e, Organoid per crypt
(c, n = 4) and crypt domain (d, n = 7) quantification from control and
HFD-fed mice (d, n = 4). Representative images: day-7 organoids (e). Arrows
denote organoids; asterisks denote aborted crypts. f, Number of secondary
organoids per dissociated crypt-derived primary organoids (n = 9 primary
organoids, 3 primary organoids per sample were individually subcloned
in 3 independent experiments). g, h, Frequencies of ISCs (Lgr5-GFPhi,
dark green) and progenitors (Lgr5-GFPlow, light green) in the entire small
intestine (g, n = 10) and colon (h, n = 8) as measured by flow cytometry.


culture than those from controls (Fig. 1c, e and Extended Data Fig. 3j).
Furthermore, these organoids were more cystic (that is, less differentiated)
in structure and contained fewer crypt domains (Fig. 1d).
When sub-cloned, HFD-derived primary organoids generated more
secondary organoids (Fig. 1f and Extended Data Fig. 3k). Consistent
with these findings, HFD crypt-derived organoids had higher frequencies
of Lgr5+ ISCs compared to controls (Extended Data Fig. 4a, d, e),
and possessed diverse intestinal cell types, such as Paneth cells, ISCs,
enteroendocrine cells and goblet cells (Extended Data Fig. 4b-f).

To determine whether a HFD also augments crypt regeneration
in vivo, we performed a clonogenic microcolony assay to test for ISC
activity. After administration of a lethal dose of irradiation, HFD-fed
mice manifested increased numbers of surviving, proliferating crypts
(Ki67+ cells per crypt) that possessed more Olfm4+ ISCs per unit length
of intestine relative to controls (Extended Data Fig. 2e-g). These data
support the notion that a HFD boosts the numbers and regenerative
capacity of ISCs in vitro and in vivo.


HFD reduces the niche dependence of ISCs 
To assess the effects of a HFD on ISCs and progenitors, we used
Lgr5-EGFP-IRES-CreERT2 knock-in mice for the quantification and isolation
of green fluorescent protein (GFP)-expressing ISCs (Lgr5-GFPhi)
and progenitor cells (Lgr5-GFPlow). Compared to controls, mice on
a HFD had an increased frequency of Lgr5-GFPhi ISCs in the small
intestine (Fig. 1g) and colon (Fig. 1h and Extended Data Fig. 3g).

The opposing effects of a HFD on ISC and Paneth cell numbers led
us to ask whether a HFD alters ISC function and niche dependence.
i, j, Organoid-initiating capacity of control and HFD ISCs cultured with/
without Paneth cells (i, n = 4). Representative images: day-5 primary
organoids (arrows, j). k, Number of secondary organoids per dissociated
ISC-derived primary organoid (n = 4). l, m, Crypts (l) and ISCs (m)
isolated from HFD-fed mice that were reverted to a standard chow diet
(HFSC) retained augmented organoid-forming capacity for 1 week (w)
(red; n = 4) but not for 1 month (m) (blue; n = 4) when compared to their
HFD counterparts (n = 6 crypts, n = 4 ISCs). Unless otherwise indicated,
data are mean ± s.d. from n independent experiments. *P < 0.05,
**P < 0.01, ***P < 0.001 (Student’s t-tests). Scale bars, 20 µm (a, b)
and 100 µm (e, j). Histological analysis: a, Olfm4: 10 crypts per group,
Crp4: 50 crypts per group; b, 50 crypts per group in each experiment.
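
The summary statistics named in this legend (mean ± s.d., Student's t-test and the asterisk thresholds) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: it uses the sample standard deviation and an unpaired, pooled-variance t statistic, neither of which the legend specifies, and the function names are hypothetical rather than the authors' analysis code. Converting the t statistic to a P value needs a t-distribution table or a statistics library and is left out here.

```python
import math
import statistics


def mean_sd(values):
    """Mean and sample standard deviation (n - 1 denominator; an assumption)."""
    return statistics.mean(values), statistics.stdev(values)


def student_t(a, b):
    """Unpaired two-sample Student's t statistic with pooled variance.

    Assumes equal variances in both groups and at least two values per group.
    """
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def stars(p_value):
    """Map a P value to the asterisk annotation used in the legends."""
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""


# Hypothetical measurements, for illustration only
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]
print(mean_sd(control), student_t(control, treated), stars(0.005))
```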


We assayed the clonogenic potential of ISCs from control and
HFD-fed mice either alone or in combination with the niche
Paneth cells. Consistent with earlier studies, control ISCs
by themselves inefficiently formed organoids, but robustly
formed organoids when co-cultured with Paneth cells (Fig. 1i).
Surprisingly, HFD-derived ISCs alone (without Paneth cells) had
an increased capacity to initiate organoids with multilineage
differentiation and more secondary organoids than control ISCs
(Fig. 1i-k and Extended Data Fig. 4h, i, l, m). Co-culture with Paneth
cells further increased the organoid-initiating activity of HFD ISCs
(Fig. 1i). Organoids derived from control and HFD ISCs alone effectively
produced Paneth cells within 24 h of culture (Extended Data
Fig. 4j, k). Furthermore, crypts and ISCs isolated from mice that had
been on a HFD, but were returned to a standard chow diet, retained
an enhanced capacity to initiate organoids for more than 7 days but
less than 4 weeks, indicating that the effects of a HFD are reversible
(Fig. 1l, m). These data, together with the observation that a HFD
uncouples the in vivo expansion of ISCs from their Paneth cell niche,
suggest ISCs undergo autonomous changes in response to a HFD that
poise them for niche-independent growth in the organoid assay.


Fatty acids drive organoid self-renewal 

To address whether dietary constituents of the HFD can recapitulate
aspects of the HFD-evoked stem-cell phenotype, we expanded
control organoids in crypt media supplemented with palmitic acid,
a main component of the HFD. Treatment with palmitic acid did
not alter the clonogenic potential of control crypts in primary culture
Figure 2 | Ex vivo exposure of intestinal organoids to palmitic acid
recapitulates aspects of a HFD. a, Clonogenicity of naive crypts cultured
with 30 µM palmitic acid (PA) in primary organoid cultures (n = 5).
Representative images: day-4 organoids. V, vehicle. b, c, Secondary
organoid formation of 1,000 sorted live primary organoid cells after
4 weeks of 30 µM palmitic acid treatment (b, n = 3). Representative
images: day-4 secondary organoids (arrows, c). d, e, Frequency (d) and
organoid initiation (e) of ISCs (Lgr5-GFPhi) after 4 weeks of 30 µM
palmitic acid exposure (d, n = 3; e, n = 4). Unless otherwise indicated, data
are mean ± s.d. from n independent experiments. *P < 0.05, **P < 0.01
(Student’s t-tests). Scale bars, 100 µm (a, c) and 50 µm (c, inset).


(Fig. 2a). However, as observed with organoids from HFD-fed mice,
primary organoids exposed ex vivo to palmitic acid gave rise to more
secondary organoids than controls (Fig. 2b, c and Extended Data
Fig. 5a). Consistent with these findings, organoids treated with palmitic
acid possessed nearly twofold more Lgr5+ ISCs (Fig. 2d and Extended
Data Fig. 5b), and manifested reduced niche dependence in the organoid
assay (Fig. 2e). Similar results were obtained with other fatty acids
such as oleic acid and a lipid mixture in mouse and human intestinal
organoids (Fig. 3h-k and Extended Data Fig. 5c-f). These findings
indicate that key dietary constituents of a HFD are sufficient to recapitulate
aspects of the in vivo HFD stem-cell phenotype.


HFD acts through PPAR-δ in ISCs

To gain mechanistic insight into how HFD mediates these effects, we
performed messenger RNA sequencing on isolated Lgr5-GFPhi ISCs
and Lgr5-GFPlow progenitor cells from control and HFD-fed mice,
respectively (Extended Data Fig. 6p). Gene set enrichment analysis
(GSEA) pathway and transcription factor binding motif analyses
revealed enrichment for transcriptional targets and binding motifs
of the nuclear receptor peroxisome proliferator-activated receptor
(PPAR) family and PPAR heterodimeric binding partners liver/retinoid
X receptor (LXR/RXR; Extended Data Fig. 6c, d). Three members
(α, δ and γ) comprise the PPAR family; among these, PPAR-δ (Ppard)
is the predominant one expressed in intestinal stem and progenitor
cells at the mRNA level in control and HFD-fed mice (Extended Data
Fig. 6a, b). Therefore, we focused our attention on PPAR-δ and its
potential role in coupling a HFD to ISC function.

Although PPAR-δ expression itself did not substantially increase
(Extended Data Fig. 6a, b), the HFD robustly induced expression of
many of its target genes at the mRNA level in both the small intestine
(Extended Data Fig. 6e) and colon (Extended Data Fig. 3h). The induction
of the PPAR-δ program was verified at the protein level in ISCs
and progenitors (Fig. 3a). To address functionally whether engagement
of a PPAR-δ program mimics the HFD, we administered the PPAR-δ
agonist GW501516 for 4 weeks to Lgr5-EGFP-IRES-CreERT2 mice.
Treatment led to strong induction of PPAR-δ target proteins in ISCs
and progenitors (Fig. 3a). Furthermore, agonist-activated PPAR-δ signalling
augmented the in vivo frequencies of Olfm4+ and Lgr5+ ISCs
(Fig. 3b, e and Extended Data Fig. 6f) and proliferation of stem and
progenitor cells (Fig. 3c), but had no effect on Paneth cell numbers
(Fig. 3b and Extended Data Fig. 6g). Notably, small intestinal (Fig. 3d)
and colonic (Extended Data Fig. 3l, m) crypts from agonist-treated
mice initiated more organoids than those from vehicle-treated mice.
Similar to ISCs from HFD-fed mice, ISCs derived from agonist-treated
mice were more effective at Paneth cell-independent organoid
initiation than their control counterparts (Fig. 3f). In addition, organoids
exposed to the PPAR-δ agonist had more Lgr5+ ISCs (Fig. 3g)
and more self-renewing capacity in secondary assays (Fig. 3i). These
data indicate that sustained PPAR-δ signalling largely recapitulates the
effects of a HFD on ISC function.

Because ex vivo fatty acids mimic aspects of a HFD, we asked whether
this phenomenon occurs through PPAR-δ signalling. As with a HFD, we
observed that ex vivo exposure of mouse and human organoids to fatty
acids evokes a robust PPAR-δ program (Extended Data Figs 5g-j and
6h). To assess the necessity of PPAR-δ in this response to fatty acids,
we generated tamoxifen-inducible, intestine-specific Ppard conditional
mice (Extended Data Fig. 6i, j). Acute ablation of Ppard in the intestine
had no noticeable effects on the numbers, proliferation or function
of ISCs and progenitor cells (Fig. 3h and Extended Data Fig. 6i-n).
However, loss of Ppard blocked both the self-renewal-enhancing effects
of fatty acids and PPAR-δ agonist (Fig. 3i, j) and the induction of
PPAR-δ target gene expression in secondary organoid assays (Extended
Data Fig. 6o). These findings demonstrate that PPAR-δ mediates
fatty-acid-driven organoid self-renewal.


HFD and PPAR-δ raise β-catenin activity

Because HFD and PPAR-δ activation confer increased stem-cell
function, we asked whether these interventions regulate the
Wnt/β-catenin pathway, which is required for ISC maintenance. First,
we observed more nuclear β-catenin, a proxy for its activity, in sorted
ISCs and progenitors and on intestinal sections from HFD and PPAR-δ
agonist-treated mice compared to controls (Extended Data Fig. 7c-i).
Second, crypts from HFD and PPAR-δ agonist-treated mice required
less exogenous Wnt for organoid maintenance than controls
(Fig. 4a, b). Lastly, we found that increased levels of nuclear β-catenin
associate with PPAR-δ in HFD crypts (Extended Data Fig. 7j-l).

To address how a HFD and agonist-activated PPAR-δ influence
β-catenin transcriptional activity, we performed microfluidic-based
multiplexed single-cell quantitative reverse transcription PCR (qRT-PCR)
using primers for a curated list of known β-catenin target genes
that includes ISC markers (Supplementary Table 2 and Extended
Data Fig. 8a, d). While a HFD did not alter expression of stem-cell
signature genes (that is, Lgr5) that differ between stem and progenitor
cells (Extended Data Fig. 8b, e, g, h), it evoked expression of
a subset of β-catenin target genes such as Bmp4, Jag1, Jag2 and Edn3
in ISCs and progenitors (Fig. 4c, e, Extended Data Figs 3i and 8c, f,
i, j and Supplementary Information). Single-cell qRT-PCR analysis
confirmed that agonist-activated PPAR-δ also induced transcription
of Bmp4, Jag1, Jag2 and Edn3 in ISCs and progenitors (Fig. 4d, f). We
further validated Jag1 expression by single-molecule in situ hybridization
and found that it was broadly expressed within HFD (Extended
Data Fig. 8k) and PPAR-δ agonist-treated crypts (Extended Data
Fig. 8l). Moreover, in response to in vitro fatty acids, PPAR-δ was
required for the induction of Jag1 and Jag2 in secondary organoids
(Extended Data Fig. 6o). Collectively, these results support a model in
which a HFD activates a PPAR-δ-mediated subset of β-catenin target
genes in ISCs and progenitor cells.

To interrogate whether a similar program exists in an alternative 
model of obesity, we assessed how the intestine adapts to obesity in 
leptin receptor-deficient (db/db) mice, an obesity model that develops
on a standard diet. Overall, we found that intestinal adaptation in
the db/db obesity model was mostly opposite to what we observed in 
HFD-fed mice (Extended Data Fig. 9 and Supplementary Information). 
Such differences highlight that, even in obesity, diet affects ISC and 
progenitor biology. 
Figure 3 | Activated PPAR-δ in ISCs mediates the effects of a HFD.
a, Immunoblots of PPAR-δ target proteins in flow-sorted ISCs (Lgr5-GFPhi)
and progenitors (Lgr5-GFPlow) from control, HFD, vehicle and
GW501516 (GW) mice (n = 2). b, Quantification of Olfm4+ ISCs (n = 4)
and Crp4+ Paneth cells (n = 3) by in situ hybridization in proximal jejunal
crypts. c, BrdU incorporation in ISCs (crypt base columnar cells adjacent
to Paneth cells) and progenitors (transit-amplifying cells not adjacent
to Paneth cells) after a 4-h pulse (n = 4). d, Organoid per crypt from
vehicle- and GW501516-treated mice (n = 3). e, Frequencies of flow-sorted
ISCs (Lgr5-GFPhi) and progenitors (Lgr5-GFPlow) (n = 5) from the entire
small intestine of vehicle and GW501516-treated mice. f, Organoid-initiating
capacity of ISCs derived from vehicle and GW501516-treated
mice. Representative images: day-12 organoids (n = 5). g, Frequency
of ISCs (Lgr5-GFPhi) in organoids after 14 days of ex vivo GW501516
exposure (n = 3). h-j, Primary (h, n = 5) and secondary (i, j, n = 5;
normalized to vehicle) organoid-forming capacity of control (PpardL/L
(L, loxP)) and Ppard intestinal knockout (IKO) mice upon ex vivo
treatment with vehicle, palmitic acid, lipid mixture (L) and GW501516.
Representative images: day-4 secondary organoids (j). k, Normalized
clonogenicity of human-derived intestinal organoids after ex vivo
treatment with palmitic acid, lipid mixture and GW501516 in secondary
culture (n = 4, see Methods). Unless otherwise indicated, data are
mean ± s.d. from n independent experiments. *P < 0.05, **P < 0.01,
***P < 0.001 (Student’s t-tests). Scale bars, 20 µm (b, c), 200 µm (f) and
100 µm (j). Histological analysis: b, Olfm4: 15 crypts per group, Crp4:
50 crypts per group; c, 50 crypts per group in each experiment.
For western blot source data, see Supplementary Fig. 1.


PPAR-δ permits non-ISCs to beget tumours

Somatic stem cells often accumulate the initial mutations that lead to
oncogenic transformation. We found that in a HFD there is a
greater incidence of spontaneous intestinal low-grade dysplastic lesions
(adenomas), carcinomas, or both than in controls (Fig. 5a-c), which
may reflect the fact that there are more ISCs in mice on a HFD that
can acquire oncogenic mutations. Because HFD-induced PPAR-δ also
activates a β-catenin signature in progenitor cells, we proposed that
Figure 5 | PPAR-δ activation confers organoid and tumour-initiating
capacity to non-stem-cells. a, b, Representative spontaneous
intestinal tumour from a HFD mouse: gross image (a) and microscopic
haematoxylin and eosin (H&E) image (b). c, Incidence of spontaneous
intestinal low-grade dysplastic lesions (adenoma) and carcinomas in
control (n = 19) and HFD (n = 25) mice. d, e, Organoid-initiating capacity
of progenitors (Lgr5-GFPlow) from HFD (n = 7) and GW501516-treated
(n = 5) mice. Representative images: day-7 organoids (e). f, Schematic
assessing in vitro and in vivo adenoma-initiating capacity of Apc-null
ISCs (Lgr5-GFPhi) and progenitors (Lgr5-GFPlow) from vehicle- and
GW501516-treated mice. Tam, tamoxifen. g, Numbers and representative
day-5 images of adenomatous organoids from Apc-null ISCs (Lgr5-GFPhi)
and progenitors (Lgr5-GFPlow) treated with/without GW501516 in
EN media (EGF and Noggin only) (n = 6). h, Optical colonoscopy of
tumours formed after orthotopic transplantation of 10,000 Apc-null ISCs
(Lgr5-GFPhi) or Apc-null progenitors (Lgr5-GFPlow) from vehicle- and
GW501516-treated mice (n = 5). i, Model of intestinal adaptation to HFD:
mechanistically, HFD activates a PPAR-δ-mediated program that augments
the organoid- and tumour-initiating capacity of intestinal progenitors.
A feature of the PPAR-δ program includes induction of a subset of
β-catenin target genes. P, Paneth cell; Pr, progenitor cell; S, stem cell;
T, tumour cell. Red dotted lines denote Apc-null cells with tumour-forming
capability. Unless otherwise indicated, data are mean ± s.d. from
n independent experiments. *P < 0.05, **P < 0.01, ***P < 0.001 (Student's
t-tests). Scale bars, 50 µm (b, g, bottom) and 200 µm (e, g, top).


non-ISC progenitor populations can acquire stem-cell features and
contribute to tumour initiation in diet-induced obesity. To explore
this possibility, we asked whether HFD or agonist-activated PPAR-δ
influenced the organoid-initiating capacity of progenitors (non-ISCs).
Interestingly, Lgr5-GFPlow progenitors, but not terminally differentiated
villous enterocytes, from HFD-fed and PPAR-δ-agonist-treated
mice formed organoids (Fig. 5d, e and Extended Data Figs 4g and 7a,
b), raising the possibility that enforced PPAR-δ signalling in intestinal
progenitors not only bestows organoid-initiating capacity but also
tumour-initiating potential.

To test this possibility, we generated ApcL/L; Lgr5-EGFP-IRES-CreERT2
mice to assess whether pharmacological PPAR-δ activation modifies the
tumorigenic capacity of the Lgr5-GFPlow progenitors. Injection with
tamoxifen leads to Apc loss in the Lgr5-GFPhi ISCs, which in turn
generates Apc-null Lgr5-GFPlow progenitors (Extended Data Fig. 10c).
Four days after tamoxifen administration, we isolated Apc-null
Lgr5-GFPhi ISCs and Lgr5-GFPlow progenitors from vehicle- and
PPAR-δ-agonist-treated mice to assess the tumour-forming potential of
these populations using separate assays (Fig. 5f, schematic): first,
we examined their capacity to give rise to adenomatous organoids in
culture; and second, we interrogated the ability of 10,000 sorted stem
and progenitor cells to initiate adenomas in syngeneic recipient
colons.

We found that Apc-null ISCs from PPAR-δ-agonist-treated mice were as
clonogenic as those from vehicle controls; however, enforced PPAR-δ
signalling in Apc-null progenitors markedly boosted their ability to
form adenomatous organoids (Fig. 5g). Next, we assessed the potential
of freshly isolated ISC and progenitor cells to form in vivo intestinal
adenomas (Fig. 5f, schematic). As in the organoid assay, the PPAR-δ
agonist had no additive effect on the ability of Apc-null stem cells to
form β-catenin+ (Apc-null) adenomas (Fig. 5h and Extended Data
Fig. 10a, d) compared to vehicle controls. However, enforced PPAR-δ
signalling permitted Apc-null progenitors, but not their vehicle-treated
counterparts, to initiate β-catenin+ (Apc-null) adenomas robustly
after transplantation into recipient colons (Fig. 5h and Extended Data
Fig. 10b, d). These data indicate that PPAR-δ activation enables a subset
of non-ISC progenitors to initiate adenomatous growth in vitro and
in vivo (see Supplementary Information).


Discussion 

Our data favour a model in which a HFD augments ISC self-renewal
and bestows features of stemness (that is, organoid-initiating capacity)
on non-stem-cell progenitors by activating PPAR-δ signalling (Fig. 5i).
A previous study shows that a different dietary regimen, calorie
restriction, increases both stem and Paneth cell numbers and regulates
ISC function non-cell autonomously through the Paneth cell niche, with
no significant effect on progenitor cell function!. By contrast, here we
find that a long-term HFD has opposing effects on stem and Paneth
cell numbers, and that these stem cells are less dependent on Paneth
cells in functional assays. The fact that we find induction of β-catenin
targets Jag1 and Jag2 (Notch ligands typically elaborated by Paneth cells)
in HFD stem and progenitor cells suggests a possible role for Notch
signalling. In a HFD, proximate ISCs or progenitor cells may serve as a
surrogate source of Notch ligands for Lgr5+ ISCs not in direct contact
with Paneth cells, enabling them to persist in vivo and in the organoid
assay. A recent report”® that PPAR-δ activation in the bone amplifies
β-catenin signalling is consistent with our finding that diet-activated
PPAR-δ engages a restricted β-catenin program. Notably, genes within
the PPAR-δ-activated β-catenin program include Jag1, Jag2 and Bmp4,
which are often deregulated in early intestinal tumorigenesis” ~*!
(Fig. 4c-f).

Recent studies propose that intrinsic and extrinsic factors contribute
to cancer risk through the control of stem-cell divisions*>*”.
These models predict that extrinsic factors such as a HFD may raise
cancer risk by increasing the divisions of stem cells, which are the
implicated cell-of-origin for many cancers. Our data (Fig. 5i) and a
previous study*? demonstrate that a HFD augments the numbers and
proliferation of ISCs, which may partially account for the increase of
intestinal tumours in this model of obesity. Another possibility raised
by our results is that a HFD-driven PPAR-δ program also enhances the
susceptibility of non-ISCs to undergo oncogenic transformation, thus
establishing a larger and more diverse pool of cells capable of
initiating tumours. Consistent with this notion, it has been proposed that
differentiated cells (non-ISCs) in the background of Apc-deficiency
with concurrent activation of oncogenic KRAS and pro-inflammatory
NF-κB signalling have the capacity to initiate tumours’. Whether and
how obesity-related inflammation in a HFD contributes to PPAR-δ
signalling and intestinal tumorigenesis is unknown. In our models, we
find no evidence that a HFD or its predominant fatty acid constituents
activate inflammatory pathways in intestinal crypts** or organoids,
respectively (Extended Data Fig. 3a-e).

While some previous work indicates that PPAR-δ inhibition may
have modest anti-cancer effects!”, sustained PPAR-δ activation and
a HFD have been linked to colorectal cancer initiation and
progression”*>~*°. Future studies will need to address whether PPAR-δ
inhibition in the setting of a HFD affects tumour initiation and progression.
Lastly, it will be important to explore whether lean ketogenic diets, which like
a HFD are composed largely of fatty acids but with fewer carbohydrates,
mimic the pro-regenerative effects of a HFD while minimizing the
untoward sequelae of obesity.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

Mice, HFD and drug treatment. Mice were housed in the Unit for Laboratory
Animal Medicine at the Whitehead Institute for Biomedical Research
and Koch Institute for Integrative Cancer Research. The following strains
were obtained from the Jackson Laboratory: Lgr5-EGFP-IRES-CreERT2 (strain
name: B6.129P2-Lgr5tm1(cre/ERT2)Cle/J, stock number 008875), Rosa26-lacZ (strain
name: B6.129S4-Gt(ROSA)26Sortm1Sor/J, stock number 003474), db/db (strain
name: B6.BKS(D)-Leprdb/J, stock number 000697), PpardL/L (strain name:
B6.129S4-Ppardtm1Rev/J, stock number 005897). Apc loxP exon 14 (ApcL/L) has been
previously described*!. Villin-CreERT2 was a gift from S. Robine. Long-term
HFD was achieved by feeding male and female mice a dietary chow consisting
of 60% kcal fat (Research Diets D12492) beginning at the age of 8-12 weeks and
extending for a period of 9-14 months. Control mice were sex- and age-matched
and fed standard chow ad libitum. GW501516 (Enzo) was reconstituted in DMSO
at 4.5 mg ml⁻¹ and diluted 1:10 in a solution of 5% PEG400 (Hampton Research),
5% Tween80 (Sigma), 90% H2O for a daily intraperitoneal injection of 4 mg kg⁻¹.
Apc exon 14 was excised by tamoxifen suspended in sunflower seed oil (Spectrum
S1929) at a concentration of 10 mg ml⁻¹, dosed at 250 µl per 25 g of body weight
and administered by intraperitoneal injection twice over 4 days before collecting
tissue. PpardL/L mice were administered 4-5 intraperitoneal injections of tamoxifen
on alternate days. Mice were analysed within 2 weeks of the last tamoxifen
injection. BrdU was prepared at 10 mg ml⁻¹ in PBS, passed through a 0.22-µm filter
and injected at 100 mg kg⁻¹.
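
The dosing figures above can be checked with a short calculation. The sketch below is illustrative only: it assumes that "diluted 1:10" means one part stock in ten parts total, and it uses a hypothetical 25-g mouse; the function name is ours and not part of any published protocol.

```python
def injection_volume_ml(dose_mg_per_kg, body_weight_g, conc_mg_per_ml):
    """Volume (ml) needed to deliver dose_mg_per_kg to one animal."""
    dose_mg = dose_mg_per_kg * body_weight_g / 1000.0
    return dose_mg / conc_mg_per_ml


# GW501516: 4.5 mg/ml stock, assumed here to be diluted to one tenth.
gw_working_mg_per_ml = 4.5 / 10
# Hypothetical 25-g mouse receiving 4 mg per kg: about 0.22 ml per injection.
gw_volume_ml = injection_volume_ml(4.0, 25.0, gw_working_mg_per_ml)

# Tamoxifen: 250 µl of a 10 mg/ml suspension per 25 g of body weight,
# which corresponds to 100 mg per kg per injection.
tam_dose_mg = 10.0 * 0.250
tam_dose_mg_per_kg = tam_dose_mg / (25.0 / 1000.0)

print(f"GW501516 working solution: {gw_working_mg_per_ml:.2f} mg/ml")
print(f"GW501516 volume for a 25-g mouse: {gw_volume_ml:.3f} ml")
print(f"Tamoxifen dose: {tam_dose_mg_per_kg:.0f} mg/kg")
```

If the dilution was instead one part stock plus ten parts vehicle, the working concentration and injection volume change accordingly.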

Immunohistochemistry and immunofluorescence. As previously described',
tissues were fixed in 10% formalin, paraffin embedded and sectioned. Antigen
retrieval was performed with Borg Decloaker RTU solution (Biocare Medical) in
a pressurized Decloaking Chamber (Biocare Medical) for 3 min. Antibodies used:
rat anti-BrdU (1:2,000 (immunohistochemistry (IHC)), 1:1,000
(immunofluorescence (IF)), Abcam 6326), rabbit chromogranin A (1:4,000 (IHC),
1:250 (IF), Abcam 15160), rabbit monoclonal non-phospho β-catenin (1:800 (IHC),
1:400 (IF), CST 8814S), mouse monoclonal β-catenin (1:200, BD Biosciences 610154),
rabbit polyclonal lysozyme (1:250, Thermo RB-372-A1), rabbit polyclonal
MUC2 (1:100, Santa Cruz Biotechnology 15334), rabbit monoclonal OLFM4
(1:10,000, gift from CST, clone PP7). Biotin-conjugated secondary donkey
anti-rabbit or anti-rat antibodies were used from Jackson ImmunoResearch. The
Vectastain Elite ABC immunoperoxidase detection kit (Vector Labs PK-6101)
followed by Dako Liquid DAB+ Substrate (Dako) was used for visualization.
For immunofluorescence, Alexa Fluor 568 secondary antibody (Invitrogen)
was used with Prolong Gold (Life Technologies) mounting media. All antibody
incubations involving tissue or sorted cells were performed with Common
Antibody Diluent (Biogenex). Organoids were fixed with 4% paraformaldehyde,
permeabilized with 0.5% Triton X-100 in PBS, rinsed with 100 mM glycine in
PBS, blocked with 10% donkey serum in PBS, incubated overnight with primary
antibody at 4°C, rinsed and incubated with Alexa Fluor 568 secondary
antibody (Invitrogen), and mounted with Prolong Gold (Life Technologies)
mounting media.

In situ hybridization. The in situ hybridization probes used in this study
correspond to expressed sequence tags or fully sequenced cDNAs obtained from Open
Biosystems. The accession numbers (IMAGE mouse cDNA clone in parentheses)
for these probes are as follows: mouse Olfm4 BC141127 (9055739), mouse
Crp4 BC134360 (40134597). Both sense and antisense probes were generated to
ensure specificity by in vitro transcription using DIG RNA labelling mix (Roche)
according to the manufacturer's instructions and to previously published detailed
methods**”. Single-molecule in situ hybridization was performed using Advanced
Cell Diagnostics RNAscope 2.0 HD Detection Kit.

Radiation and clonogenic microcolony assay. Adult mice were exposed to 15 Gy 
of ionizing irradiation from a 137-caesium source (GammaCell) and euthanized 
after 72h. The number of surviving crypts per length of the intestine was enumer- 
ated from haematoxylin-and-eosin-stained sections’. 

Immunoprecipitation and immunoblotting. Antibodies: rabbit polyclonal
anti-PPAR-δ (1:100, Thermo PA1-823A), rabbit polyclonal anti-CPT1a (1:250,
ProteinTech 15184-1-AP), rabbit polyclonal anti-HMGCS2 (1:500, Sigma
AV41562), rabbit monoclonal anti-FABP1 (1:1,000, Abcam ab129203), NF-κB
Sampler Pathway Kit (CST, 9936S), mouse monoclonal anti-STAT-3 (CST, 9139P),
rabbit monoclonal anti-P-STAT3 (Y705) XP (CST, 9145P), mouse monoclonal
anti-CREB (CST, 86B10), mouse monoclonal anti-β-catenin (1:200, BD Biosciences
610154), rabbit polyclonal anti-γ-tubulin (1:1,000, Sigma T5192). For
immunoprecipitation assays, crypts were collected and nuclear extraction was carried out
using Abcam nuclear extraction kit (ab113474) following the manufacturer's
instructions. Nuclear extracts were incubated with 5 µg anti-PPAR-δ antibody (Thermo),
or anti-rabbit IgG control antibody (Santa Cruz) overnight at 4 °C followed by
2 h of incubation with Dynabeads Protein G for immunoprecipitation. Protein
complexes bound to antibody and beads were washed five times and eluted with
Laemmli sample buffer. Samples were resolved by SDS-PAGE. Protein interaction
was analysed by immunoblotting.

Lgr5-GFPhi ISCs or Lgr5-GFPlow progenitors were sorted directly into Laemmli
sample buffer and boiled for 5 min. Samples were resolved by SDS-PAGE and
analysed by immunoblotting with horseradish peroxidase (HRP)-conjugated IgG
secondary antibodies (1:10,000, Santa Cruz Biotechnology sc-2054) and Western
Lightning Plus-ECL detection kit (Perkin Elmer NEL104001EA).

Flow cytometry and isolation of ISCs, colonic stem cells and Paneth cells. As
previously reported and briefly summarized here, small intestines and colons
were removed, washed with cold PBS without magnesium chloride and calcium
(PBS−/−), opened longitudinally, and then cut into 3-5-mm fragments. Pieces
were washed several times with cold PBS−/− until clean, washed 2-3 times with
PBS−/− EDTA (10 mM), incubated on ice for 90-120 min, and gently shaken at
30-min intervals. Crypts were then mechanically separated from the connective
tissue by more rigorous shaking, and then filtered through a 70-µm mesh into a 50-ml
conical tube to remove villus material (for small intestine) and tissue fragments.
Crypts were removed from this step for crypt culture experiments and embedded
in Matrigel with crypt culture media. For ISC isolation, the crypt suspensions were
dissociated to individual cells with TrypLE Express (Invitrogen). Cell labelling
consisted of an antibody cocktail comprising CD45-PE (eBioscience, 30-F11),
CD31-PE (Biolegend, Mec13.3), Ter119-PE (Biolegend, Ter119), CD24-Pacific
Blue (Biolegend, M1/69), CD117-APC/Cy7 (Biolegend, 2B8), and EPCAM-APC
(eBioscience, G8.8). ISCs were isolated as Lgr5-EGFPhi Epcam+ CD24low/− CD31−
Ter119− CD45− 7-AAD−. EGFPlow progenitors were isolated as EGFPlow Epcam+
CD24low/− CD31− Ter119− CD45− 7-AAD−, and Paneth cells from small intestine
were isolated as CD24hi Sidescatterhi Lgr5-EGFP− Epcam+ CD31− Ter119− CD45−
7-AAD− with a BD FACS Aria II SORP cell sorter into supplemented crypt
culture medium for culture. Dead cells were excluded from the analysis with the
viability dye 7-AAD (Life Technologies). When indicated, populations were
cytospun (Thermo Cytospin 4) at 800 r.p.m. for 2 min, or allowed to settle at 37°C in
fully humidified chambers containing 5% CO2 onto poly-L-lysine-coated slides
(Polysciences). The cells were subsequently fixed in 4% paraformaldehyde (pH 7.4,
Electron Microscopy Sciences) before staining.

Culture media for crypts and isolated cells. Isolated crypts were counted and
embedded in Matrigel (Corning 356231 growth factor reduced) at 5-10 crypts per
µl and cultured in a modified form of medium as described previously’. Unless
otherwise noted, Advanced DMEM (Gibco) was supplemented by EGF 40 ng ml⁻¹
(R&D), Noggin 200 ng ml⁻¹ (Peprotech), R-spondin 500 ng ml⁻¹ (R&D or Sino
Biological), N-acetyl-L-cysteine 1 µM (Sigma-Aldrich), N2 1X (Life Technologies),
B27 1X (Life Technologies), Chiron 10 µM (Stemgent), Y-27632 dihydrochloride
monohydrate 20 ng ml⁻¹ (Sigma-Aldrich). Colonic crypts were cultured in 50%
conditioned medium derived from L-WRN cells supplemented with Y-27632
dihydrochloride monohydrate 20 ng ml⁻¹, as described**. Approximately 25-30 µl
droplets of Matrigel with crypts were plated onto a flat bottom 48-well plate
(Corning 3548) and allowed to solidify for 20-30 min in a 37°C incubator. Three
hundred microlitres of crypt culture medium was then overlaid onto the Matrigel,
changed every 3 days, and maintained at 37°C in fully humidified chambers
containing 5% CO2. Clonogenicity (colony-forming efficiency) was calculated by
plating 50-300 crypts and assessing organoid formation 3-7 days or as specified
after initiation of cultures. Palmitic acid (Cayman Chemical Company 10006627
conjugated to BSA), oleic acid (Sigma 01008), lipid mixture (Sigma L0288),
or GW501516 (Enzo) were added immediately to cultures at 30 µM (palmitic
acid, oleic acid), 2% (lipid mixture), and 1 µM (GW501516). 4-OH tamoxifen
(Calbiochem, 579002, 10 nM) was added to organoid cultures derived from
PpardL/L; Villin-CreERT2 (Ppard IKO) crypts to ensure Ppard excision in the
ex vivo fatty acid or GW501516 experiments.
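
As an illustration of the clonogenicity readout, colony-forming efficiency can be expressed as the percentage of plated crypts that give rise to organoids. The sketch below assumes this simple ratio; the function name and the counts are hypothetical and are not data from this study.

```python
def colony_forming_efficiency(organoids_formed, crypts_plated):
    """Percentage of plated crypts (or cells) that formed an organoid."""
    if crypts_plated <= 0:
        raise ValueError("crypts_plated must be positive")
    return 100.0 * organoids_formed / crypts_plated


# Hypothetical well: 150 crypts plated, 45 organoids counted at day 7.
efficiency = colony_forming_efficiency(45, 150)
print(f"Colony-forming efficiency: {efficiency:.1f}%")
```

The same ratio applies to sorted single cells, with the number of cells seeded as the denominator.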

Isolated ISCs or progenitor cells were centrifuged for 5 min at 250g,
resuspended in the appropriate volume of crypt culture medium (500-1,000 cells µl⁻¹),
then seeded onto 25-30 µl Matrigel (Corning 356231 growth factor reduced)
containing 1 µM Jagged (Ana-Spec) in a flat bottom 48-well plate (Corning 3548).
Alternatively, ISCs and Paneth cells were mixed after sorting in a 1:1 ratio,
centrifuged, and then seeded onto Matrigel. The Matrigel and cells were allowed to
solidify before adding 300 µl of crypt culture medium. The crypt media was changed
every second or third day. Organoids were quantified on days 3, 7 and 10 of culture,
unless otherwise specified.

Secondary mouse organoid assays. For secondary organoid assays, either individ- 
ual primary organoids or many primary organoids were mechanically dissociated 
and then replated, or organoids were dissociated for 10 min in TrypLE Express at 
32°C, resuspended with SMEM (Life Technologies), centrifuged (5 min at 250g) 
and then resuspended in cold SMEM with the viability dye 7-AAD. Live cells 
were sorted and seeded onto Matrigel as previously described in standard crypt
media (not supplemented with lipids or GW501516). Secondary organoids were
enumerated on day 4, unless otherwise specified. 

Human crypt cultures. Human biopsies were obtained from patients with
informed consent undergoing intestinal resection at the Massachusetts General
Hospital (MGH). The MGH Institutional Review Board committee and
Massachusetts Institute of Technology Committee on the Use of Humans as
Experimental Subjects approved the study protocols. Crypts were isolated’,
embedded in Matrigel and subsequently exposed to lipid mixture, palmitic acid
or GW501516 (as described earlier). Cultures were passaged weekly and
maintained for 3-4 weeks. To passage, equal numbers of organoids from each
condition were disrupted with trypsin/EDTA. Numbers of organoids were counted
4-7 days after passaging into control media. Counts were normalized to numbers
of organoids present in control wells and plotted. Statistical significance was
calculated by performing analysis of variance (ANOVA) multiple comparisons of the
means for each group. For quantitative RNA expression analysis, organoids were
dissociated, cells were selected as a live population by flow cytometry (7-AAD, Life
Technologies), and sorted into Tri Reagent (Life Technologies) for RNA isolation.
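
The normalization of organoid counts to control wells can be sketched as follows. This is a minimal illustration that assumes each well's count is divided by the mean count of the control wells from the same experiment; the condition labels mirror the treatments named above, but the counts are hypothetical.

```python
from statistics import mean


def normalize_to_control(counts_by_condition, control="control"):
    """Divide every well count by the mean count of the control wells."""
    control_mean = mean(counts_by_condition[control])
    return {
        condition: [count / control_mean for count in wells]
        for condition, wells in counts_by_condition.items()
    }


# Hypothetical organoid counts from four wells per condition.
counts = {
    "control": [20, 22, 18, 20],
    "palmitic acid": [30, 34, 28, 32],
}
normalized = normalize_to_control(counts)
for condition, wells in normalized.items():
    print(condition, [round(value, 2) for value in wells])
```

After this step the control group has a mean of 1 by construction, so treated groups read directly as fold change relative to control.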

Electron microscopy. After 5 days of culturing, intestinal organoids were placed
into Karnovsky's KII solution (2.5% glutaraldehyde, 2.0% paraformaldehyde,
0.025% calcium chloride, in a 0.1 M sodium cacodylate buffer, pH 7.4) and fixed
overnight. Subsequently, they were post-fixed in 2.0% osmium tetroxide, stained
en bloc with uranyl acetate, dehydrated in graded ethanol solutions, infiltrated
with propylene oxide/Epon mixtures, flat embedded in pure Epon, and polymerized
overnight at 60°C. Then 1-µm sections were cut, stained with toluidine blue,
and examined by light microscopy. Representative areas were chosen for electron
microscopic study and the Epon blocks were trimmed accordingly. Thin sections
were cut with an LKB 8801 ultramicrotome and diamond knife, stained with Sato's
lead, and examined in a FEI Morgagni transmission electron microscope. Images
were captured with an AMT (Advanced Microscopy Techniques) 2K digital CCD
camera.

RNA isolation. For RNA sequencing (RNA-seq), total RNA was extracted
from 200,000 sorted Lgr5-GFPhi ISCs and Lgr5-GFPlow progenitors by pooling
2-5 71-week-old HFD male or control mice using Tri Reagent (Life Technologies)
according to the manufacturer’s instructions, except for an overnight isopropanol
precipitation at −20°C. From the total RNA, poly(A)+ RNA was selected using
Oligo(dT)25-Dynabeads (Life Technologies) according to the manufacturer's
protocol.

RNA-seq library preparation. Strand-specific RNA-seq libraries were prepared
using the dUTP-based, Illumina-compatible NEXTflex Directional RNA-Seq Kit
(Bioo Scientific) according to the manufacturer's directions. All libraries were
sequenced with an Illumina HiSeq 2000 sequencing machine.

Processing of RNA-seq reads and measuring expression level. For RNA-seq data
analysis, raw stranded reads (40 nucleotides) were trimmed to remove adaptors and
bases with quality scores below 20, and reads shorter than 35 nucleotides were
excluded. High-quality reads were mapped to the mouse genome (mm10) with
TopHat version 1.4.1 (ref. 44), using known splice junctions from Ensembl Release
70 and allowing at most two mismatches. Genes were quantified with htseq-count
(with the ‘intersect strict’ mode) using Ensembl Release 70 gene models. Gene
counts were normalized across all samples using estimateSizeFactors from the
DESeq R/Bioconductor package“. Differential expression analysis was also
performed between two samples of interest with DESeq. GSEA
(http://software.broadinstitute.org/gsea/index.jsp) was performed by using the
pre-ranked (according to their ratios) 8,240 differentially expressed genes as the
expression data set. Motif analysis was performed using the Haystack motif
enrichment tool: http://github.com/lucapinello/Haystack**.
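
The count normalization step can be illustrated with the median-of-ratios calculation that underlies DESeq size factors. The sketch below is a plain-Python re-implementation for illustration only, not the DESeq code itself; the function name and the toy count matrix are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of genes, each a list of raw counts (one per sample).
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene_counts in counts:
        if any(count <= 0 for count in gene_counts):
            continue
        log_geo_mean = sum(math.log(c) for c in gene_counts) / n_samples
        for sample, count in enumerate(gene_counts):
            log_ratios[sample].append(math.log(count) - log_geo_mean)
    return [math.exp(median(ratios)) for ratios in log_ratios]


# Toy matrix: the second sample was sequenced twice as deeply as the first.
toy_counts = [[10, 20], [100, 200], [5, 10], [0, 3]]
factors = size_factors(toy_counts)
normalized_counts = [
    [count / factors[sample] for sample, count in enumerate(gene_counts)]
    for gene_counts in toy_counts
]
print([round(f, 3) for f in factors])
```

Dividing each sample by its size factor puts the counts on a common scale, so that genes with unchanged expression have similar normalized counts across samples.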

Single-cell gene expression analysis. In total, 24 single Lgr5-GFPhi ISCs and 72
single Lgr5-GFPlow progenitor cells were sorted from control or HFD-fed mice
(n = 2 mice per group) for single-cell gene expression analysis. For one-tube
single-cell sequence-specific preamplification, individual primer sets of β-catenin
target genes (total of 96, Supplementary Table 2) were pooled to a final
concentration of 0.1 mM for each primer. Single cells were directly sorted into 96-well plates
containing 5 µl RT-PCR master mix (2.5 µl CellsDirect reaction mix, Invitrogen;
0.5 µl primer pool; 0.1 µl reverse transcriptase/Taq enzyme, Invitrogen; 1.9 µl
nuclease-free water) in each well. Immediately after, plates were placed on a PCR
machine for preamplification. The sequence-specific preamplification PCR protocol
was as follows: 60 min at 50°C for cell lysis and sequence-specific reverse
transcription; then 3 min at 95°C for reverse transcriptase inactivation and Taq
polymerase activation. cDNA was then amplified by 20 cycles of 15 s at 95°C for
initial denaturation, 15 min at 60°C for annealing and elongation. After
preamplification, samples were diluted 1:5 before high-throughput microfluidic
real-time PCR analysis using the Fluidigm platform. Amplified single-cell cDNA
samples were assayed for gene expression using individual qRT-PCR primers
and 96.96 dynamic arrays on a BioMark System by following the manufacturer's
protocol (Fluidigm). To confirm PPAR-δ-mediated induction of the most upregulated
genes (n = 3 mice, 24 ISCs and 72 progenitors per group), or for single-cell
analysis of organoid composition (n = 3 mice, 48 cells per group) and db/db mice (n = 3,
48 cells per group), standard single-cell qRT-PCR was performed using
preamplified cDNA with corresponding primers. For Fluidigm analysis, threshold cycle
(Ct) values were calculated using the BioMark Real-Time PCR Analysis software
(Fluidigm). See Supplementary Information for raw gene expression data. Gene
expression levels were estimated by subtracting the Ct values from the background
level of 35, which approximately represents the log2 gene expression levels. The
t-Distributed stochastic neighbour embedding (t-SNE) analysis*” was performed
using the MATLAB toolbox for dimensionality reduction. Differential expression
analysis was conducted using the two-sided Wilcoxon-Mann-Whitney rank sum
test implemented in the R coin package (https://www.r-project.org). P values were
adjusted for multiple testing** using the p.adjust function in R with method = ‘fdr’
option. Fold changes were calculated as the difference of median of log2 expression
levels for the two cell populations. Split violin plots were generated using the
vioplot package and the vioplot2 function in R
(https://gist.github.com/mbjoseph/5852613). The heatmap for β-catenin target genes
was generated with the MultiExperiment Viewer (MeV) program
(http://www.tm4.org/mev.html) using the correlation-based distance and average
linkage method as parameters of the unsupervised hierarchical clustering of genes.
The heatmap for organoid composition was generated using MATLAB. The
percentages of Jag1/Jag2-upregulated cells were calculated based on the number of
single cells whose log2 expression was above 15.
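
The arithmetic in this section (Ct to log2 expression, fold change as a difference of medians, the fraction of cells above a cutoff, and FDR adjustment) can be sketched in a few lines. The code below is an illustrative Python re-implementation, not the analysis scripts used in the study, which relied on R; the helper names and the Ct values are hypothetical, and the FDR helper follows the Benjamini-Hochberg procedure that the 'fdr' method of p.adjust implements.

```python
from statistics import median

BACKGROUND_CT = 35.0       # background level stated in the text
UPREGULATED_CUTOFF = 15.0  # log2 expression cutoff stated for Jag1/Jag2


def ct_to_log2_expression(ct):
    """Approximate log2 expression as background minus Ct."""
    return BACKGROUND_CT - ct


def log2_fold_change(group_a, group_b):
    """Difference of the median log2 expression levels (a minus b)."""
    return median(group_a) - median(group_b)


def percent_above(values, cutoff=UPREGULATED_CUTOFF):
    """Percentage of cells whose log2 expression exceeds the cutoff."""
    return 100.0 * sum(value > cutoff for value in values) / len(values)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank of this P value in ascending order
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical Ct values for one gene in single cells from two groups.
control_ct = [21.0, 20.5, 22.0, 19.5]
hfd_ct = [19.0, 18.5, 20.5, 18.0]
control_expr = [ct_to_log2_expression(ct) for ct in control_ct]
hfd_expr = [ct_to_log2_expression(ct) for ct in hfd_ct]

fold_change = log2_fold_change(hfd_expr, control_expr)
hfd_percent_up = percent_above(hfd_expr)
adjusted_p = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(fold_change, hfd_percent_up, [round(p, 3) for p in adjusted_p])
```

Cells with no detectable amplification would need separate handling (for example, setting their expression to zero), which this sketch does not attempt.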

qRT-PCR. Approximately 25,000 cells were sorted into Tri Reagent (Life
Technologies) and total RNA was isolated according to the manufacturer's
instructions with the following modification: the aqueous phase containing total RNA was
purified using the RNeasy plus kit (Qiagen). RNA was converted to cDNA with the
cDNA synthesis kit (Bio-Rad). RT-PCR was performed with diluted cDNA (1:5)
in three wells for each primer and SYBR green master mix (Bio-Rad) on a Bio-Rad
iCycler RT-PCR detection system. For organoid experiments, 1,000 live cells were
sorted and qRT-PCR optimized for low cell numbers (<1,000) was performed
after sequence-specific preamplification (cDNA diluted 1:200 in three wells for
each primer) as described in single-cell gene expression analysis. All qRT-PCR
experiments were repeated at least three independent times. Primers used are listed
in Supplementary Table 1.

Orthotopic transplantation. ApcL/L; Lgr5-EGFP-IRES-CreERT2 mice were
treated with vehicle or GW501516 for 1 month, and then injected with two
intraperitoneal doses of tamoxifen. Four days later, Apc-null Lgr5-GFPhi ISCs and
Lgr5-GFPlow progenitors were sorted by flow cytometry, as described earlier. For
primary cell transplantations, 10,000 Apc-null Lgr5-GFPhi ISCs and Lgr5-GFPlow
progenitors were resuspended into 90% crypt culture media (as described) and
10% Matrigel, then transplanted into the colonic lamina propria of C57BL/6
recipient mice by optical colonoscopy using a custom injection needle (Hamilton
Inc., 33-gauge, small Hub RN NDL, 16 inches long, point 4, 45 degree bevel, like
part number 7803-05), syringe (Hamilton Inc. part number 7656-01), and
transfer needle (Hamilton Inc. part number 7770-02). Optical colonoscopy was
performed using a Karl Storz Image 1 HD Camera System, Image 1 HUB CCU, 175
Watt Xenon Light Source, and Richard Wolf 1.9mm/9.5 Fr Integrated Telescope
(part number 8626.431). Four injections were performed per mouse. Mice then
underwent colonoscopy 8 weeks later to assess tumour formation. Colonoscopy
videos and images were saved for offline analysis. Following sacrifice, the distal
colons were excised and fixed in 10% formalin, then examined by haematoxylin
and eosin section to identify adenomas. Histology images were reviewed by
gastrointestinal pathologists who were blinded to the treatment groups (S.S.,
V.D. and O.H.Y.).

Statistics and animal models. All experiments reported in Figs 1-5 were repeated 
at least three independent times, except for Figs 3a, 4c, d, which were repeated 
twice. All samples represent biological replicates. For mouse organoid assays, 
2-4 wells per group with at least 3 different mice were analysed. For human orga- 
noid assays, 4 wells per group with 4 different patient samples were analysed and 
experiments were repeated 4 times. All centre values shown in graphs refer to 
the mean. For statistical significance of the differences between the means of two 
groups, we used two-tailed Student's t-tests. Statistical significance in Fig. 3k was 
calculated by performing ANOVA multiple comparisons of the means for each 
group. No samples or animals were excluded from analysis, and sample size esti- 
mates were not used. Animals were randomly assigned to groups. Studies were 
not conducted blinded, with the exception of all histological analyses and Fig. 
5c, h. All experiments involving mice were carried out with approval from the 
Committee for Animal Care at MIT and under supervision of the Department of 
Comparative Medicine at MIT. 
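
For two-group comparisons, the summary statistics and the test statistic can be sketched as below. This is an illustrative, standard-library-only calculation of mean, s.d. and the equal-variance Student's t statistic with its degrees of freedom; it does not compute the two-tailed P value, which requires the t distribution from a statistics package. The function names and measurements are hypothetical.

```python
import math
from statistics import mean, stdev, variance


def mean_sd(values):
    """Mean and sample standard deviation, as reported in the figures."""
    return mean(values), stdev(values)


def student_t(group_a, group_b):
    """Equal-variance (pooled) Student's t statistic and degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(
        pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical measurements from three independent experiments per group.
control = [1.0, 2.0, 3.0]
treated = [2.0, 3.0, 4.0]
t_stat, dof = student_t(control, treated)
print(mean_sd(control), mean_sd(treated), round(t_stat, 3), dof)
```

The two-tailed P value is then obtained by comparing the absolute value of t against the t distribution with the stated degrees of freedom.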

Extended Data Figure 1 | A HFD alters intestinal morphology and
enhances intestinal progenitor proliferation. a-g, In comparison to
mice fed a standard chow, mice on a HFD gained on average 50% mass
(a, control: n = 11, HFD: n = 15), had reduced small intestinal mass and
length (b, c, control: n = 11, HFD: n = 15), longer crypts and shorter
villi (e, g, n = 3 each), and fewer villus enterocytes (f, n = 3). HFD did not
change the density of crypts (d, n = 3) in the proximal jejunum.
The proximal jejunum was defined as the length between 6 and 9 cm
as measured from the pylorus (the distal portion of the stomach).
h-k, HFD enhanced BrdU incorporation in ISCs (or crypt base columnar
cells) and progenitor cells (or transit-amplifying cells) in the proximal
jejunum (h, n = 6) and sorted cell populations (i, n = 3) after a 4-h pulse.
HFD increased the total (j, control: n = 4, HFD: n = 5) and normalized
numbers of BrdU-labelled enterocytes compared to controls after a 24-h
pulse. Arrowhead (k) marks the leading edge of migrating
BrdU-positive enterocytes. l, m, Representative images of Olfm4 (l, n = 3)
and Crp4 (m, n = 6) in situ hybridizations from control and HFD-fed
mice. n, No significant difference in the number of jejunal caspase3+
cells was detected by immunohistochemistry. Images are representative
of three separate experiments (n = 3); arrowheads indicate representative
caspase3+ enterocytes. o, The HFD chow (Research Diets D12492)
provides a higher percentage of kilocalories from fat and conversely a
lower percentage of kilocalories from protein and carbohydrates compared
to a standard chow diet (Labdiet RMH3000). Unless otherwise indicated,
data are mean ± s.d. from n independent experiments; *P < 0.05,
**P < 0.01, ***P < 0.001 (Student's t-tests). Scale bars, 100 µm (g, k, n),
50 µm (h, i, k (inset), l, m) and 20 µm (l, m, insets); two separate fields of
jejunum (d), and at least 15 crypts (e), 15 villi (f), 10 villi (g), 100 cells (i),
25 villi (j) and 25 crypt-villus units (n) were counted per sample in each
independent experiment.

Extended Data Figure 2 | A HFD and PPAR-δ signalling have
minimal effects on enteroendocrine and goblet cell differentiation
but promote intestinal regeneration after 15 Gy irradiation.
a, b, Quantification (a, n = 3) of immunostains (b, n = 3) for
chromogranin A revealed no difference in the numbers of jejunal
enteroendocrine cells (arrowheads) per crypt-villus unit in HFD-fed
mice and GW501516-treated mice compared to their respective controls.
c, d, Quantification (c, n = 4) of Alcian blue/PAS staining (d, n = 4)
showed no difference in mucinous goblet cells (arrowhead) in HFD-fed
and GW501516-treated mice compared to their respective controls.
e, f, A HFD increased the number of regenerating crypts as measured by
an increased number of crypts containing at least ten Ki67+ (a marker
of proliferation) cells (e, n = 3) or at least one Olfm4+ cell (f, n = 3) per
5 mm of jejunum by immunohistochemistry (IHC) or in situ hybridization
(ISH). Arrows indicate Olfm4+ crypts. g, Surviving crypt numbers after
ionizing irradiation-induced (XRT) damage. Arrows denote regenerating
crypts; asterisks denote aborted crypts (n = 3). Unless otherwise indicated,
data are mean ± s.d. from n independent experiments. Scale bars, 100 µm
(b, d), 50 µm (e-g) and 20 µm (e, f, insets); 50 crypt-villus units per
sample were analysed (a, c) and approximately 50 crypts (e-g) were
counted per sample in each independent experiment.
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Extended Data Figure 3 | A HFD and fatty acids do not activate 
inflammatory pathways in intestinal crypts and organoids, while 
HED and enforced PPAR-6 signalling enhance colonic stem-cell 
function. a, A HFD did not alter the normalized expression levels of
inflammatory genes from the GSEA Molecular Signature Database
(MSigDB; signature M6557) data set in ISCs and progenitors. b, A HFD
did not induce differential expression of ‘inflammatory response’ genes
from Gene Ontology (GO; 0006954) in ISCs (Lgr5-GFPhi) or progenitors
(Lgr5-GFPlow) compared to control. Fold changes of GO inflammatory
response genes are indicated in red, and fold changes for all other genes
are indicated in blue. c, HFD did not activate the NF-κB or the STAT-3
pathways in the intestinal crypt. Total and phosphorylated protein levels
in crypt lysates were assessed by immunoblots (n = 3). For western
blot source data, see Supplementary Fig. 1. d, A HFD did not induce
pro-inflammatory gene expression in ISCs (Lgr5-GFPhi) or progenitors
(Lgr5-GFPlow). Relative expression levels compared to Actb were
measured by qRT-PCR (n = 5). e, Ex vivo palmitic acid, lipid mixture or
GW501516 treatment did not induce inflammatory gene expression in
crypt-derived organoids compared to vehicle. Relative expression levels
compared to Actb were assessed by qRT-PCR (n = 4, 12 wells per sample
were analysed). f, A HFD boosted the number of BrdU-labelled cells as
measured in distal colonic crypts compared to control (control n = 6,
HFD n = 5) after a 4-h pulse. g, A HFD increased the frequency of colonic
ISCs (Lgr5-GFPhi, dark green) and progenitor cells (Lgr5-GFPlow, light
green) (n = 8). h, i, A HFD enhanced PPAR-δ (h) and β-catenin (i) target
gene expression in colonic ISCs and progenitors. Relative expression
levels compared to Actb were determined by qRT-PCR (n = 5, all fold
changes are significant with P < 0.05). j–m, Colonic crypts derived from
HFD-fed (j, n = 4; k, n = 4) and GW501516-treated (l, n = 5; m, n = 4)
mice demonstrated greater primary and secondary organoid-forming
capacity compared to their respective controls. Representative images:
day-4 organoids. Unless otherwise indicated, data are mean ± s.d. from
n independent experiments; *P < 0.05, **P < 0.01, ***P < 0.001 (Student’s
t-tests). Scale bars, 50 µm (f), 100 µm (j) and 200 µm (l); 50 crypts per
sample were analysed (f) in each independent experiment.
Extended Data Figure 4 | Characterization of HFD crypt and
ISC-derived organoids. a, HFD organoids contained higher frequencies
of ISCs (Lgr5-GFPhi) compared to control (n = 3). b, c, Control and HFD
organoids demonstrated no differences in morphologic ultrastructure
as seen in 1-µm sections of control (left) and HFD (right) organoids
counterstained with Toluidine Blue (b), and electron microscopy images
of representative control (left) and HFD (right) organoids (c) (n = 3).
d, e, Composition of organoids derived from control (d) and HFD (e)
crypts as assessed by single-cell gene expression analysis. Organoids
on day 5 contained ISCs (Lgr5 and Olfm4), Paneth cells (Lyz),
enteroendocrine cells (Chga), and goblet cells (Muc2). Forty-eight live
cells per group were sorted and single-cell gene expression analysis was
performed after pre-amplification using corresponding stem-cell and
lineage primers (see Methods). f, Crypt-derived organoids from control
or HFD-fed mice included chromogranin A-, mucin 2- and
lysozyme-positive cells as assessed by immunofluorescence (blue = DAPI,
red = cell-specific antibody). Images represent two experiments (n = 2).
g, Cultured villi from control and HFD-fed mice lack the ability to form
organoids. Images represent two experiments with 6 wells per sample
(n = 2). h, ISCs from HFD-fed mice contained greater organoid-forming
potential compared to controls. Arrowheads indicate representative
organoids at days 4, 7 and 10 of culture (n = 4). i, Individually dissociated
HFD primary organoids that were derived from single ISCs possessed
more secondary organoid-forming ability than those from controls
(n = 4). Representative images: day-4 secondary organoids. j, k, Single-cell
gene expression analysis revealed that ISCs from both control (j) and
HFD (k) mice can beget Paneth cells (Lyz) within 24 h in culture (48 cells
per group, see Methods). l, m, Composition of organoids derived from
control (l) and HFD (m) ISCs (Lgr5-GFPhi) as assessed by single-cell
gene expression analysis (48 cells per group, see Methods). Organoids
on day 5 contained ISCs (Lgr5 and Olfm4), Paneth cells (Lyz), endocrine
cells (Chga) and goblet cells (Muc2). Unless otherwise indicated, data are
mean ± s.d. from n independent experiments; *P < 0.05 (Student's t-tests).
Scale bars, 20 µm (b), 2 µm (c), 50 µm (f), 200 µm (g, i) and 100 µm (h).
Extended Data Figure 5 | Ex vivo exposure of mouse and human
organoids to fatty acids recapitulates aspects of a HFD.
a, b, Individually dissociated primary organoids possessed more secondary
organoid-forming activity (a, n = 4, the mean number of secondary
organoids subcloned from each of 5 primary organoids in 4 independent
experiments), and contained a higher frequency of Lgr5-GFPhi ISCs
(b, n = 3) after 4 weeks of treatment with 30 µM palmitic acid compared
to vehicle. c, Exposure of naive crypts to 30 µM oleic acid had no effect
on primary organoid formation measured at day 7 (n = 6). Representative
images: day-7 organoids. d, Individually dissociated primary organoids
possessed more secondary-organoid-forming capacity after 4 weeks of
treatment with 30 µM oleic acid (n = 4, the mean number of secondary
organoids subcloned from each of 5 primary organoids in 4 independent
experiments) compared to vehicle (same vehicle cohort used in a and d).
e, Lipid mixture composition (Sigma L0288) as described by the

crypts (H1–H4) passaged in the presence of lipid mixture, palmitic acid
or GW501516 augmented relative clonogenicity compared to vehicle, as
shown in representative images from 4 independent experiments. H1:
n = 10 (vehicle, palmitic acid, GW501516) and n = 6 (lipid mixture) wells
were analysed. H2: n = 16 (vehicle), n = 6 (lipid), n = 12 (palmitic acid)
and n = 14 (GW501516) wells were analysed. H3: n = 10 (vehicle), n = 12
(lipid, palmitic acid) and n = 8 (GW501516) wells were analysed. H4: n = 7
(vehicle, GW501516), n = 6 (lipid) and n = 9 (palmitic acid) wells were
analysed. Age, gender and BMI are specified. g–j, Human crypt-derived
organoids after ex vivo treatment with palmitic acid, lipid or GW501516
induced PPAR-δ target gene expression as assessed in passaged cultures
with qRT-PCR (n = 4, 12 wells per sample were analysed, all fold changes
are significant, P < 0.05). Unless otherwise indicated, data are mean ± s.d.
from n independent experiments; *P < 0.05, **P < 0.01, ***P < 0.001
(Student’s t-tests). Scale bars, 100 µm (c) and 500 µm (f).
Extended Data Figure 6 | PPAR-δ is the predominant PPAR family
member expressed in intestinal progenitors and mediates the effects
of HFD. a, PPAR-δ is the most abundant PPAR family member in ISCs
(Lgr5-GFPhi) and progenitors (Lgr5-GFPlow) based on RNA-seq data.
b, Confirmation of PPAR family member mRNA expression levels in
ISCs (Lgr5-GFPhi) and progenitors (Lgr5-GFPlow) by qRT-PCR (n = 5).
c, Genes upregulated in HFD ISCs (Lgr5-GFPhi) versus control ISCs
were enriched in PPAR and LXR/RXR motifs. d, GSEA of RNA-seq
data identified enrichment of PPAR-δ targets in ISCs (Lgr5-GFPhi) and
progenitors (Lgr5-GFPlow) with a HFD. e, Confirmation of induced
PPAR-δ target gene expression in flow-sorted ISCs (Lgr5-GFPhi) and
progenitors (Lgr5-GFPlow) by qRT-PCR (n = 5). All fold changes were
significant, P < 0.05. f, g, Representative images of Olfm4⁺ (ISCs, f) and
Crp4⁺ (Paneth cells, g) in situ hybridization from vehicle and
GW501516-treated mice (f, n = 3; g, n = 4). h, Ex vivo exposure of organoids to
palmitic acid, lipid mixture or GW501516 stimulated PPAR-δ and
β-catenin target gene expression (n = 3, all fold changes were significant,
P < 0.05). i, j, Injection with tamoxifen (4 injections on alternating days) in
PpardL/L; Villin-CreERT2 mice led to efficient intestinal deletion (IKO) of
Ppard (7 days after the last tamoxifen dose), as assessed by allele-specific
deletion PCR (i, n = 3) and immunoblot analysis (j, n = 3) of crypts. For
western blot source data, see Supplementary Fig. 1. k, Acute disruption
of Ppard (8 days after the last tamoxifen dose) did not perturb ISC and
progenitor proliferation, as determined 4 h after BrdU administration
(n = 3). l, m, Acute Ppard deletion (8 days after the last tamoxifen dose)
did not significantly alter Olfm4⁺ ISC numbers (L/L: n = 5, IKO: n = 4)
(l) or Crp4⁺ Paneth cell (n = 5) (m) numbers, as assessed by in situ
hybridization. n, Loss of Ppard transcripts in Ppard IKO organoids was
confirmed by qRT-PCR using deletion-specific primers (n = 3).
o, PPAR-δ is required for the induction of PPAR-δ and β-catenin target
gene expression in secondary organoids after ex vivo palmitic acid, lipid
or GW501516 treatment (n = 5, all fold changes are significant, P < 0.05).
p, Heat map of differentially expressed genes illustrated induction
of a PPAR-δ program in HFD-derived ISCs and progenitors relative to
controls. Unless otherwise indicated, data are mean ± s.d. from
n independent experiments; *P < 0.05 (Student's t-tests). Scale bars, 50 µm
(f, g, k–m) and 20 µm (insets); 50 crypts per sample were analysed in each
independent experiment (f, g, k–m).
Extended Data Figure 7 | HFD and PPAR-δ signalling boost nuclear
β-catenin localization and activity in intestinal progenitors. a, b,
HFD-derived ISCs (a, Lgr5-GFPhi) and progenitors (b, Lgr5-GFPlow) required
less Wnt3a and R-spondin to initiate organoids than control ISCs, as
measured by comparing organoid formation in complete ENRW media,
which includes EGF, Noggin, R-spondin and Wnt3a, versus EN media,
which includes EGF and Noggin but lacks Wnt3a and R-spondin (n = 3).
Control-derived progenitors, in contrast to HFD-derived progenitors,
rarely formed organoids in either ENRW or EN media. c–f, HFD increased
nuclear β-catenin localization in flow-sorted ISCs and progenitors from
HFD (c, n = 5) and GW501516-treated (d, n = 4) mice as determined by
immunofluorescence (red, DAPI; cyan, non-phosphorylated β-catenin,
CST 8814S). At least 100 cells per sample were quantified. Representative
images are shown in e and f. g–i, HFD (g) and GW501516 (h) treatment
increased the numbers of ISCs and progenitors with β-catenin⁺ nuclei,
as assessed by immunostaining (n = 4 each). Representative images
are shown in i; arrowheads indicate representative nuclear β-catenin
in ISCs (red) and progenitors (black). j–l, Association of PPAR-δ and
β-catenin in control and HFD-derived intestinal crypts as shown by
immunoprecipitation (IP) (n = 3). For western blot source data, see
Supplementary Fig. 1. Unless otherwise indicated, data are mean ± s.d.
from n independent experiments; *P < 0.05, **P < 0.01, ***P < 0.001
(Student's t-tests). Scale bars, 50 µm (e, f) and 20 µm (i); organoid assays:
2–4 wells per sample analysed (a, b), 50 crypts per sample were analysed in
each independent experiment (g, h).
Extended Data Figure 8 | HFD-mediated alterations in β-catenin
target gene expression in single ISCs and progenitors. a, Heat map
representation of β-catenin target gene expression in single ISCs
(Lgr5-GFPhi, 24 cells) and progenitors (Lgr5-GFPlow, 72 cells) (see
Methods). b, Stem-cell signature genes were identified by comparing
target gene expression in control ISCs (Lgr5-GFPhi) to control progenitors
(Lgr5-GFPlow). c, HFD signature genes were identified by comparing
target gene expression in HFD ISCs to control ISCs (Lgr5-GFPhi).
d, t-Distributed stochastic neighbour embedding (tSNE) analysis of
single cells using all β-catenin target genes. e, tSNE analysis of single
cells using stem-cell signature genes. f, tSNE analysis of single cells using
HFD signature genes. g, h, Lgr5 expression was similar in HFD ISCs
(Lgr5-GFPhi) (g) and progenitors (Lgr5-GFPlow) (h) as compared to their
respective controls. i, j, HFD increased the percentage of ISCs
(Lgr5-GFPhi) (i) and progenitors (Lgr5-GFPlow) (j), with increased Jag1
and Jag2 expression compared to their respective controls. k, l, HFD
(k, n = 3) and GW501516 treatment (l, n = 4) augmented Jag1 expression
compared to control and vehicle treatments, respectively, as assayed
by single-molecule in situ hybridization: Jag1 is broadly expressed
throughout the crypt. Unless otherwise indicated, data are mean ± s.d.
from n independent experiments; *P < 0.05 (Student’s t-tests). Scale
bars, 50 µm (k, l) and 20 µm (insets); more than 50 crypts per sample
were analysed in each independent experiment (k, l). See Supplementary
Information for raw gene expression data.
Extended Data Figure 9 | Characterization of obese db/db mouse
intestines. a–f, At 4–5 months of age, homozygous db/db mice gained
on average 50% more mass (a, n = 9), had increased small intestinal mass
and length (b, c, n = 9), shallower crypts (d, n = 4) and longer villi
(e, f, n = 5) than control db/+ mice. g, h, Immunostains for OLFM4
(n = 6) and lysozyme (n = 6) revealed a slight reduction in the number of
Olfm4⁺ ISCs and Paneth cells, respectively, in db/db mice compared to
db/+ controls. i, Organoid-forming capacity of db/db crypts was higher
(P = 0.095) than db/+ controls (n = 7). j, Single-cell gene expression
analysis revealed no induction of PPAR-δ or β-catenin target gene
Unless otherwise indicated, data are mean ± s.d. from n independent
experiments; *P < 0.05, **P < 0.01, ***P < 0.001 (Student's t-tests). Scale
Extended Data Figure 10 | PPAR-δ activation bestows
adenoma-initiating capacity to Apc-null progenitors. a, b, Representative optical
endoscopy images (top) from Fig. 5, with H&E (middle) and β-catenin
(immunohistochemistry, bottom) sections of adenomas derived from
orthotopic transplantation of Apc-null ISCs (a, Lgr5-GFPhi) and
progenitors (b, Lgr5-GFPlow) from vehicle- and GW501516-treated
mice 4 days after Apc deletion. Tumours exhibited hyperchromasia, lack
of maturation, nuclear crowding and nuclear β-catenin positivity. Two
independent pathologists blinded to treatment groups interpreted the
results. c, d, Apc deletion was confirmed in sorted small intestinal ISCs
and progenitors from vehicle- and GW501516-treated ApcL/L;
Lgr5-EGFP-IRES-CreERT2 mice 4 days after tamoxifen administration (c, n = 3)
and in isolated tumours (d, n = 3) by PCR amplification, using
allele-specific deletion primers targeting exon 14. Unless otherwise indicated,
n represents independent experiments. Scale bars (a, b), 50 µm (20×) and
20 µm (60×).
Priming and polymerization of a
bacterial contractile tail structure


Abdelrahim Zoued, Eric Durand, Yannick R. Brunet, Silvia Spinelli, Badreddine Douzi, Mathilde Guzzo,
Nicolas Flaugnatti, Pierre Legrand, Laure Journet, Rémi Fronzes, Tam Mignot, Christian Cambillau & Eric Cascales


Contractile tails are composed of an inner tube wrapped by an outer sheath assembled in an extended, metastable 
conformation that stores mechanical energy necessary for its contraction. Contraction is used to propel the rigid inner 
tube towards target cells for DNA or toxin delivery. Although recent studies have revealed the structure of the contractile 
sheath of the type VI secretion system, the mechanisms by which its polymerization is controlled and coordinated with 
the assembly of the inner tube remain unknown. Here we show that the starfish-like TssA dodecameric complex interacts 
with tube and sheath components. Fluorescence microscopy experiments in enteroaggregative Escherichia coli reveal 
that TssA binds first to the type VI secretion system membrane core complex and then initiates tail polymerization. TssA 
remains at the tip of the growing structure and incorporates new tube and sheath blocks. On the basis of these results, 
we propose that TssA primes and coordinates tail tube and sheath biogenesis. 


Contractile injection machines are nano-structures evolved to deliver 
macromolecules into target cells!. These machines have been elabo- 
rated for different purposes such as the injection of DNA into host 
cells in the case of bacteriophages, for the delivery of protein effectors 
into bacterial or eukaryotic cells in the case of R-pyocins, Photorhabdus 
virulence cassettes, anti-feeding prophages or type VI secretion sys- 
tems (T6SS) or for inducing metamorphosis in invertebrates!~°. These 
machines include a tubular edifice called a tail”. The tail is essentially 
composed of a rigid inner tube wrapped by a contractile structure— 
the sheath—that is assembled in an extended conformation that stores 
mechanical energy necessary for its contraction and to propel the inner 
tube towards the target®. The tail is assembled on the baseplate that var- 
ies in terms of composition and number of subunits; however, a min- 
imal baseplate consists of the hub protein surrounded by wedges!”*. 
The baseplate is not only the platform for the assembly of the tube/ 
sheath, but also an important component of the signalling cascade that 
triggers sheath contraction’. Tails are usually completed by terminator 
proteins that stabilize the sheath and maintain tube and sheath together 
at the distal end to prevent energy dissipation during sheath contraction 
and to permit proper ejection of the inner tube*!”. 

The T6SS is composed of a contractile structure anchored to the cell 
envelope by the TssJLM membrane complex that serves as a docking sta- 
tion as well as a channel for the passage of the inner tube during sheath 
contraction!'-!* (Extended Data Fig. 1a). The contractile structure is 
composed of the tail tube made up of stacks of Hcp hexameric rings, 
wrapped by a sheath-like structure consisting of the TssB and TssC sub- 
units (Extended Data Fig. 1a)'*. During T6SS biogenesis, the assembly 
of the tube and sheath is coordinated, with the insertion of a tube ring
immediately preceding that of a sheath block’. This tail polymerizes on 
a baseplate-like complex composed of the VgrG hub and the TssE, TssF, 
TssG and TssK subunits!®!? (Extended Data Fig. 1a). The TssBC sheath 
polymerizes in tens of seconds to build an ~600-nm long structure that 


contracts in a few milliseconds!”. Contraction of the sheath propels the 
Hcp inner tube towards the target cell, like a ‘nano-crossbow’!4, and
is responsible for the delivery of toxin effectors, as it correlates with 
lysis of the competitor bacterium?*”!. Recent cryo-electron micros- 
copy studies have revealed the atomic structure of the T6SS sheath 
in its contracted conformation”*”>. The sheath is a helical structure 
composed of six TssB/TssC heterodimer strands, each heterodimer
being stabilized by an intra- and inter-strand handshake domain”’. In 
addition, a cryo-electron microscopy study of the pyocin R2 has pro- 
vided information regarding the atomic structure of this contractile 
nanotube in its extended conformation and on how it interacts with 
the inner tube”. Although the general mechanism of T6SS assembly 
and the structure of the T6SS sheath are now well documented, critical 
details are missing, such as how the polymerization of the sheath is 
controlled, how tube and sheath assembly is coordinated and how tail 
polymerization is stopped. 


TssA initiates tail tube/sheath polymerization 

During T6SS tail biogenesis, the recruitment and assembly of Hcp 
hexamers and TssBC sheath blocks should be coordinated and the tail 
tube and sheath should be firmly attached together at the distal end to 
allow proper tube throwing during contraction. We therefore hypoth- 
esized that at least one of the T6SS core proteins must be required to 
coordinate and/or terminate Hcp/TssBC tail assembly. Such candidate 
subunit(s) should interact with both the tube protein (Hcp) and with 
at least one component of the sheath (TssB and/or TssC). We there- 
fore performed a systematic bacterial two-hybrid analysis in which 
Hcp, TssB and TssC were used as baits to identify prey partners within 
T6SS subunits. Extended Data Fig. 1b shows that a number of baseplate 
components (TssE, TssF, TssG and VgrG) interact with either Hcp or 
TssC. However, a unique protein, TssA (GenBank accession number: 
284924261), interacts with both tube and sheath components. Recent 
Figure 1 | In vivo imaging of sfGFP-TssA. a, TssA localizes in mobile foci.
Fluorescence microscopy time-lapse recording of wild-type (WT) EAEC
cells producing sfGFP-TssA. Individual images were taken every 30 s.
The localization of TssA is indicated by the red arrowhead. A schematic
diagram representing the localization of TssA (from light to dark green
as a function of time) is shown in the inset. Mean square displacement
and kymograph analyses are shown in Extended Data Fig. 3a, b. Scale bar,
1 µm. b, TssA forms static foci in the absence of the TssBC sheath proteins.
Fluorescence microscopy time-lapse recording of ΔtssBC cells producing
sfGFP-TssA. c, TssA is associated with the distal end of the TssBC sheath
during elongation. Fluorescence microscopy time-lapse recording of
wild-type EAEC cells producing sfGFP-TssA and TssB-mCherry. The mCherry

data showed that TssA is required for proper formation of the Hcp 
tube’’, arguing against a role of TssA as a terminator-only protein. In 
agreement with this conclusion, fluorescence microscopy experiments 
show that T6SS sheaths do not assemble in tssA cells (Extended Data
Fig. 1c). In addition, Hcp tube proteins are not released from tssA cells
(Extended Data Fig. 1d). Collectively, these data suggest a critical role 
of TssA as a regulator of T6SS tail biogenesis. 

Given that T6SS tube/sheath assembly is initiated on the baseplate 
complex which is docked at the cytoplasmic side of the TssJLM complex’, 
we hypothesized that TssA interacts with baseplate or membrane 
complex components. First, fractionation experiments showed that the 
TssA protein mainly localizes in the cytoplasm but a significant amount 
of the protein is associated with the membrane fraction (Extended Data 
Fig. 1e). TssA association with the inner membrane is probably dependent
on the T6SS membrane complex as co-purification and gel-filtration 
experiments showed that TssA binds to the detergent-solubilized 
TssJLM complex (Extended Data Fig. 2a, b). Negative-stain electron 
microscopy of the TssJLM-TssA complex further demonstrated the 
presence of an ~300 Å-large complex associated with the cytoplasmic
base of the TssJLM rocket-like structure (Extended Data Fig. 2c, d). 
To gain further information on TssA localization and dynamics, we 
fused TssA to a fluorescent reporter, the superfolder green fluorescent 
protein (sfGFP). sfGFP was inserted at the tssA locus on the chromo- 
some, to engineer cells producing a functional sfGFP-TssA chimaera 
protein. Time-lapse microscopy recordings showed that TssA does not 
distribute randomly but rather assembles 1-3 discrete foci, located close 
to the membrane (Fig. 1a). Most of these foci are not fixed and show 
directional movement (Extended Data Fig. 3a). Kymographic analyses 
confirmed that TssA foci move with a unidirectional trajectory, at 
a constant velocity (see the schematic representation in the inset of 
channel (left), GFP channel (middle) and merged channels (right) are
shown. Individual images (from top to bottom) were taken every 30 s.
The initiation, polymerization and contraction/disassembly stages of the
T6SS sheath dynamics are indicated on the right, with a schematic diagram
of the observed events. d, TssA initial localization requires TssM but
not VgrG or TssL. Fluorescence microscopy time-lapse recording of the
indicated ΔvgrG, ΔtssL or ΔtssM cells producing sfGFP-TssA. Individual
images were taken every 30 s. Scale bar, 1 µm. Recordings of sfGFP-TssA
in ΔtssE, ΔtssF, ΔtssG, ΔtssK and Δhcp cells are shown in Extended
Data Fig. 3e. Large fields, statistical analyses and kymograph analyses
are shown in Extended Data Fig. 3f–h, respectively.


Fig. 1a, and the kymograph in Extended Data Fig. 3b). Based on these 
trajectories, we wondered whether TssA foci might be pushed by the 
elongation of the sheath. In ΔtssBC cells, TssA foci were still assembled
in close proximity to the membrane, but their dynamics were completely
abolished: the TssA foci remained static (Fig. 1b and Extended
Data Fig. 3a). This experiment showed that the TssBC sheath does not
interfere with TssA recruitment and primary localization but rather
pushes the TssA cluster during its elongation. We further monitored 
TssB-mCherry and sfGFP-TssA dynamics. Time-lapse recordings 
and kymographic analyses confirmed that TssA-containing complexes 
assemble first close to the membrane and are then pushed by the sheath 
towards the opposite side of the bacterium (Fig. 1c and Extended Data 
Fig. 3b). In addition, fluorescence lifetime imaging microscopy (FLIM) 
assays demonstrated that TssA molecules do not appear to turn over 
between the foci and the intracellular pool, therefore suggesting that 
the same TssA or TssA-containing complex remains at the distal end 
of the sheath (Extended Data Fig. 3c, d). However, because sheath contraction
is a very fast event that occurs in a few milliseconds” and is
immediately followed by ClpV-mediated disassembly”, our experiments
could not determine whether TssA remains associated with the sheath
during contraction.
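
The contrast drawn above between directionally moving and static foci can be made concrete with a mean square displacement (MSD) calculation. The following is a minimal sketch only, not the analysis used for Extended Data Fig. 3a: the two tracks, the speed and the function names are hypothetical. For motion at constant velocity the MSD grows with the square of the lag time, so the log-log slope is close to 2, whereas a static focus gives an MSD of zero at every lag.

```python
import math

def mean_square_displacement(track, max_lag):
    """Time-averaged MSD of a 2D track [(x, y), ...] for lags 1..max_lag (in frames)."""
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd

def loglog_slope(lags, values):
    """Least-squares slope of log(values) against log(lags)."""
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(value) for value in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical focus moving at constant speed along x (arbitrary units per frame).
speed = 0.02
directed_track = [(speed * t, 0.0) for t in range(40)]
# Hypothetical static focus, as described for cells lacking the sheath.
static_track = [(0.5, 0.5)] * 40

lags = list(range(1, 11))
alpha = loglog_slope(lags, mean_square_displacement(directed_track, 10))
static_msd = mean_square_displacement(static_track, 10)

print(f"directed focus: log-log MSD slope = {alpha:.2f}")
print(f"static focus: largest MSD value = {max(static_msd):.2f}")
```

A slope near 1 would instead point to free diffusion, which is why the exponent is a convenient summary when classifying tracks.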

To provide insights into the assembly of the T6SS and particularly
on the early events before tube/sheath elongation, we tested the local- 
ization and dynamics of sfGFP-TssA in various mutant backgrounds. 
The biogenesis of the T6SS begins with the initial positioning of the TssJ 
outer membrane lipoprotein and progresses with the sequential recruit- 
ment of the inner membrane TssM and TssL proteins!’, and then the 
TssEFGK-VgrG baseplate’’. Deletion of baseplate components or Hcp
did not affect TssA localization but abolished its dynamics (Fig. 1d and 
Extended Data Fig. 3e-h). By contrast, the sfGFP-TssA fluorescence 
Figure 2 | High-resolution structures of TssA domains. a, X-ray
structure of the C-terminal domain of TssA (TssACt) (PDB: 4YO5). Top
(left panel) and side (right panel) views of the TssACt dodecamer structure,
shown in ribbon representation with each monomer differently coloured.
The inset of the right panel shows the rainbow coloured (blue to red
from the N terminus) structure of one TssACt monomer. The consecutive
α-helices are numbered h1 to h7. The crystal structure of TssANt2 is shown
in Extended Data Fig. 6f, g. b, Fitting of the TssACt (red ribbon) and
TssANt2 (blue ribbon) X-ray structures into the TssA electron microscopy
reconstruction top (top panel) or side (bottom panel) views (EMD-3282;
grey volume). Scale bar, 10 nm. SAXS and negative-stain electron
microscopy models of TssA are shown in Extended Data Fig. 5h–j and
5p–r, respectively. Scale bar, 5 nm.


was diffuse in tssM cells but remained clustered in tssL cells (Fig. 1d 
and Extended Data Fig. 3e-h). Taken together, these results show that 
TssA is recruited in the early stages of T6SS biogenesis, after formation 
of the TssJ-TssM complex (Extended Data Fig. 3i). The interaction of 
TssA with the TssJM complex was further confirmed by co-purification 
experiments (Extended Data Fig. 2a). 

To identify additional TssA partners, the interaction of TssA with
all the T6SS soluble core components and the soluble domains of the 
TssL, TssM and TssJ proteins, was first tested by a bacterial two-hybrid 
approach. Extended Data Fig. 4a shows that in addition to Hcp and 
TssC, and the previously described TssA-TssK interaction?®, TssA 
interacts with the TssE and VgrG baseplate subunits. To validate these 
results by an alternative approach, native TssA, Hcp, TssE, VgrG and
the TssBC complex were purified and the interactions were assessed 
by surface plasmon resonance (Extended Data Fig. 4b-e). In the 
four cases, we observed interactions between the two partners, with 
dissociation constants ranging from ~2 to ~50 µM (Extended Data
Fig. 4b-e). The affinities of these complexes are rather low but have
to be placed in the context of multiple interactions that probably act
synergistically. The interactions of TssA with Hcp or TssBC have higher
affinities than those of TssA with VgrG and TssE, and TssA
dissociates more rapidly from VgrG or TssE than from Hcp or TssBC
(Extended Data Fig. 4b-e). This suggests that the TssE and VgrG baseplate
proteins may first interact with TssA and that these interactions are
then displaced by the tube and sheath components (Hcp, TssBC).
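
The link between dissociation rate and affinity in this comparison follows from the standard 1:1 binding relation K_D = k_off / k_on. The sketch below illustrates that relation under stated assumptions: the rate constants are hypothetical values chosen only so that K_D falls at the two ends of the ~2 to ~50 µM range quoted above (the individual rates are not given in the text), and the function names are illustrative.

```python
def dissociation_constant(k_on, k_off):
    """K_D (M) from an association rate k_on (M^-1 s^-1) and a dissociation rate k_off (s^-1)."""
    return k_off / k_on

def fraction_bound(partner_conc, k_d):
    """Equilibrium occupancy for a simple 1:1 interaction at a given partner concentration (M)."""
    return partner_conc / (partner_conc + k_d)

# Hypothetical rate constants (assumptions, not measured values).
k_on = 1.0e4        # M^-1 s^-1, taken to be the same for both partners
k_off_slow = 0.02   # s^-1, slowly dissociating partner (Hcp- or TssBC-like)
k_off_fast = 0.5    # s^-1, rapidly dissociating partner (VgrG- or TssE-like)

kd_tight = dissociation_constant(k_on, k_off_slow)
kd_weak = dissociation_constant(k_on, k_off_fast)

for label, kd in (("slow off-rate", kd_tight), ("fast off-rate", kd_weak)):
    occupancy = fraction_bound(10e-6, kd)  # 10 µM partner, an illustrative concentration
    print(f"{label}: K_D = {kd * 1e6:.0f} µM, occupancy at 10 µM = {occupancy:.2f}")
```

With equal association rates, a 25-fold faster off-rate translates directly into a 25-fold weaker affinity, which is consistent with a weakly bound partner being displaced by a more tightly bound one.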

Based on the interaction network and on the fluorescence micros- 
copy recordings, we conclude that positioning of TssA on the mem- 
brane complex recruits the baseplate and initiates polymerization of the 
tube and sheath. However, TssA remains associated with the distal end 
of the polymerized structure during the elongation process. 


Structural organization of the TssA protein 

The electron microscopy analyses of the TssJLM-TssA complex 
(Extended Data Fig. 2d) suggest that TssA assembles into an ~300 Å-large
complex. A 6 × His-tagged thioredoxin-TssA fusion (TRX-TssA)
was produced and purified to homogeneity by metal-ion affinity and
size-exclusion chromatography (Extended Data Fig. 5a). Gel-filtration
(Extended Data Fig. 5b) and on-line multi-angle laser light scattering/
quasi-elastic light scattering/absorbance/refractive index (MALS/ 
QELS/UV/RI, Extended Data Fig. 5c) analyses defined a mass of 
891 kDa, which corresponds to the mass of a TRX-TssA dodecamer 
(theoretical mass = 888 kDa). Crystallization attempts with the purified 
full-length TssA protein obtained after tag cleavage by the TEV protease 
failed. We therefore examined the protein sample using small-angle 
X-ray scattering (SAXS, Extended Data Fig. 5d-j) and electron microscopy
after negative staining (Extended Data Fig. 5k-r). The ~19 Å
resolution single-particle reconstruction of TssA from the electron 
micrographs (Extended Data Fig. 5k-n) showed that TssA assembles 
two stacked hexamers with arm-like short extensions (Extended Data 
Fig. 5p-r). However, we noted that the electron microscopy density 
does not account for the complete mass of a TssA dodecamer and 
therefore suspected that the arms may represent a flexible domain, 
shortened by the averaging procedure. This was confirmed by SAXS 
studies demonstrating that TssA is composed of a central hexameric 
core bearing six long arms yielding a starfish-like structure (Extended 
Data Fig. 5h–j and Extended Data Table 1). Whereas the SAXS model
allows better visualization of the arm length compared to the electron 
microscopy reconstruction, its low resolution impairs the visual sepa- 
ration of the dimeric arms. 
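
The oligomeric state inferred from the light-scattering mass can be checked with simple arithmetic. The sketch below uses only the two masses quoted above (891 kDa measured, 888 kDa theoretical for a TRX-TssA dodecamer); the variable names are illustrative.

```python
measured_mass_kda = 891.0       # MALS/QELS/UV/RI estimate quoted in the text
theoretical_12mer_kda = 888.0   # theoretical TRX-TssA dodecamer mass quoted in the text

# Mass of one TRX-TssA protomer implied by the theoretical dodecamer mass.
protomer_kda = theoretical_12mer_kda / 12

# Number of protomers implied by the measured mass, and its deviation from theory.
estimated_subunits = measured_mass_kda / protomer_kda
relative_error = (measured_mass_kda - theoretical_12mer_kda) / theoretical_12mer_kda

print(f"protomer mass: {protomer_kda:.1f} kDa")
print(f"estimated number of subunits: {estimated_subunits:.2f}")
print(f"deviation from theoretical mass: {relative_error:.2%}")
```

The measured mass lies within 1% of the theoretical value, so the nearest integer stoichiometry is 12 rather than 11 or 13.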

Limited proteolysis of full-length TssA using proteinase K yielded 
three stable fragments of ~45, ~33 and ~19 kDa (Extended Data Fig. 6a)
corresponding to the N-terminal 1-392, C-terminal 221-531 and 
393-531 regions of TssA, respectively (Extended Data Fig. 6b). On the 
basis of this result, two domains corresponding to fragments 1-400 
(TssANt; theoretical molecular weight, 44.2 kDa) and 395-531 (TssACt;
theoretical molecular weight, 15.1 kDa) were designed. Bacterial
two-hybrid experiments showed that these two domains oligomerize
independently (Extended Data Fig. 6c). MALS/QELS/UV/RI analyses
of the purified TssANt and TssACt domains revealed that the N-terminal
fragment is a dimer in solution (Extended Data Fig. 6d), whereas the 
C-terminal domain is dodecameric (Extended Data Fig. 6e). 

The N-terminal and C-terminal domains crystallized, and 3.37 Å and
3.35 Å resolution data sets were collected, respectively (Extended Data
Table 2). In agreement with the MALS/QELS/UV/RI data, TssANt is a
dimer in the crystal. The peptidic chain starts at residue 221 (instead
of 1) and stops at residue 377 (instead of 401). This fragment, named
TssANt2, results from protein cleavage during crystallization. The
TssANt2 domain consists of three pairs of helices (1-2, 3-4 and 5-6)
structured as a bundle followed by a seventh helix perpendicular to
the others (Extended Data Fig. 6f, g). The TssACt structure features two
stacked head-to-head hexamers 30 Å thick, with an external diameter
of 100 Å (Fig. 2a). The overall form of the hexamer is unique and resembles
that of a hexaflexagon, a kaleidocycle with six triangular wedges
that contact each other via α-helix hinges (Fig. 2a). Docking of these
X-ray structures onto the starfish-shaped electron microscopy and
SAXS reconstructions (Fig. 2b and Extended Data Fig. 7a-n) showed
that the TssACt diameter and width coincide exquisitely with the central
core of TssA, whereas TssANt2 dimers fit in the arms in close proximity
to the TssACt central core. Large density regions remain available at the
extremity of the arms and probably correspond to TssANt1 domains.

To understand the contribution of the TssA central core and arm 
domains to the T6SS assembly mechanism, we tested interactions 
between these domains and TssA partners. Bacterial two-hybrid analyses 
demonstrated that TssANt interacts with TssBC and TssE whereas
TssACt is sufficient to make contacts with Hcp and VgrG (Extended
Data Fig. 6h). SPR analyses further confirmed the TssANt-TssBC and
TssACt-Hcp interactions with Kd values of ~4.2 and ~48 μM, respectively
(Extended Data Fig. 6i, j).


Closing remarks 

In this work we identified TssA as a partner of both T6SS inner tube
and sheath. We combined structural, biochemical, functional and
microscopy analyses to reveal the specific role of the TssA subunit during
T6SS biogenesis. TssA is recruited at an early stage of T6SS biogenesis,
positions the baseplate, and initiates and guides tail tube/sheath
polymerization.

Figure 3 | Model of the assembly of the type VI secretion system.

a, b, Schematic representation of the different stages of the assembly and
mechanism of action of the T6SS (from left to right) highlighting the role
of TssA. The TssA dodecamer (red) is recruited to the T6SS membrane
complex. A negative-stain electron microscopy image of the TssJLM-TssA
complex is shown in Extended Data Fig. 2d. TssA then recruits the
baseplate and initiates polymerization of the tail by the incorporation of

A TssA dodecamer is recruited to the TssJM complex and clusters 
in discrete foci that represent the site of assembly of the T6SS tail tube 
and sheath. Previous studies have shown that TssA is not required for 
proper recruitment of TssL to the TssJM complex, and hence the
T6SS assembly pathway is branched on TssM (Extended Data Fig. 3i).
Protein-protein interaction and fluorescence microscopy analyses 
defined that TssA recruits two components of the baseplate (VgrG and 
TssE). Once the baseplate is positioned, the polymerization of the tube/ 
sheath structure is initiated. Further co-localization studies demon- 
strated that the initial static TssA clusters are then pushed towards the 
opposite side of the cell by the elongation of the sheath structure. FLIM 
experiments suggested that the same TssA particle remains associated 
to the distal extremity of the sheath. This result is also in agreement 
with the interaction of TssA with both tube (Hcp) and sheath (TssC) 
components, and with the higher affinity of TssA for these two partners 
compared to the two baseplate components, suggesting that the recruit- 
ment of Hcp and TssBC displaces TssA from the baseplate. 

TssA exhibits interesting structural characteristics. Twelve TssA pro- 
teins assemble a six-fold symmetry starfish-like particle composed of 
a central core bearing six elongated 170 Å-long arms. The TssA central
core, corresponding to the C-terminal domain, has a size and a shape
comparable to that of an Hcp hexamer (Extended Data Fig. 7o) and
interacts with the N-terminal gp27-like VgrG module and Hcp, two
proteins whose structures have been shown to be superimposable,
suggesting that TssA recognizes the same fold. The TssA arms interact
with TssE and TssC. TssE is the homologue of gp25, a component of the
bacteriophage baseplate wedges that assemble around the gp27 hub.
Therefore the positioning of the TssA central core on VgrG or Hcp 
allows the arms to contact the outer wedges or sheath rings. Indeed, 
molecular docking of the TssA electron microscopy volume to the 
sheath model in the extended conformation shows that the TssA arms 
are interdigitated with sheath subunits resulting in very complementary 
shapes and tight contacts (Extended Data Fig. 8a, b). Such an efficient 
complementarity cannot be modelled with the contracted tail sheath, 
suggesting that TssA does not bind, or does so much more weakly, to 
this conformation (Extended Data Fig. 8c). 

Hcp (black rectangles) and TssBC (blue rectangles) building blocks (a)
or the incorporation of Hcp-TssBC building blocks (b). During the
polymerization, TssA remains at the distal end of the structure. TssA is
released after sheath contraction. Docking experiments of TssA at the
distal extremity of the extended and contracted sheath are shown in
Extended Data Fig. 8a-c. A dynamic representation of this working model
is shown in Supplementary Video 1.

The TssA C-terminal domain is a hexaflexagon in which the six
wedges contact each other by hinge-like helices at the outside of the
structure. We hypothesize that large conformational modifications,
such as the displacement of these wedges to the exterior, open a large
central lumen. This lumen will have a diameter (~90 Å) sufficient to
accommodate a hexameric Hcp ring. We therefore propose a func- 
tional model in which TssA controls the coordinated polymerization 
of the tube and sheath structure (Fig. 3 and Supplementary Video 1): 
the recruitment of an Hcp ring by TssA will open the lumen allowing 
the incorporation of this Hcp ring to the growing structure. Then the 
TssA arms might be involved in the recruitment and proper positioning
of the TssBC strands around this Hcp ring, before the insertion
of a new Hcp ring, and so on (Fig. 3a). This model implies that the
Hcp hexameric unit is incorporated immediately before the TssBC 
sheath building block, a hypothesis consistent with data showing that 
in both bacteriophage T4 and T6SS, tube and sheath grow up from 
the baseplate together, with the tube protein leading and directing 
sheath assembly. However, the current data cannot rule out that a
pre-formed Hcp-TssBC complex is recruited to the growing structure 
(Fig. 3b). In these models, the TssA arms secure the sheath under an 
extended state by connecting it to the rigid Hcp tube, explaining how 
the sheath is maintained in a metastable conformation during elonga- 
tion. In bacteriophages, sheath contraction is initiated at the baseplate 
and progresses to the head. We therefore propose that TssA remains 
attached to the distal end during T6SS sheath contraction until the last 
TssBC row contracts to prevent energy dissipation and permit proper 
propulsion of the Hcp tube, before being released. This model is in 
agreement with the docking simulations showing tight contacts of TssA 
with the extended sheath only (Extended Data Fig. 8a, b). The function 
of the TssA protein is therefore different from that of the distal tail pro- 
teins (Dit) of Siphoviridae that prime tube/sheath polymerization but 
remain attached to the baseplate. It is, however, closely related to the
function of the gp3/gp15 proteins of Myoviridae. Although the TssA 
and gp15 folds are highly divergent, the overall architecture of the cen- 
tral core of TssA is similar in terms of size and diameter to gp15 or gp3 
(refs 28, 29) (Extended Data Fig. 8d-f). Notably, the gp3/gp15 proteins, 
also known as tail terminators, do not only complete tail assembly; 
pulse-chase studies of bacteriophage T4 biogenesis demonstrated that 
gp15 is not recruited once the tail tube/sheath has been polymerized 
but, conversely, that it is assembled on the baseplate before the gp19 
tube and gp18 sheath proteins. However, it is found at the opposite 
extremity of the baseplate once the tail is completed and it stabilizes 
the sheath. This dynamic is very similar to that observed for TssA,
which is first bound to the baseplate but found at the distal end once the
tail is completed. Taken together these data suggest that TssA and gp15
may both prime, control the polymerization of, complete and stabilize the
T6SS and bacteriophage T4 tails, respectively. Finally, according to this
model, the T6SS tail tube/sheath grows by the incorporation of building
blocks at the distal end of the structure, a point that remains to be
experimentally addressed. Although the newly incorporated subunits
do not transit through the Hcp lumen, this mechanism is similar to
that of the flagellar cap complex, which binds to the hook and incorporates
new flagellin subunits at the distal end by a rotary mechanism.
Defining the TssA conformational modifications that occur during tail
elongation will shed light on the molecular mechanism of T6SS tail
assembly and how the sheath is maintained in the extended state.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Bacterial strains, growth conditions and chemicals. The strains, plasmids and 
oligonucleotides used in this study are listed in the Supplementary Table. The 
enteroaggregative E. coli EAEC strain 17-2 and its ΔtssA, ΔtssBC, ΔtssE, ΔtssF,
ΔtssG, ΔtssK, ΔtssL, ΔtssM, Δhcp, ΔvgrG and tssB-mCherry isogenic derivatives
were used for this study. The E. coli K-12 DH5α, W3110, BTH101 and T7
Iq pLysS strains were used for cloning steps, co-immunoprecipitation, bacterial
two-hybrid and protein purification, respectively. Strains were routinely grown
in LB rich medium (or Terrific broth medium for protein purification) or in Sci-1
inducing medium (SIM; M9 minimal medium, glycerol 0.2%, vitamin B1 1 μg ml⁻¹,
casaminoacids 100 μg ml⁻¹, LB 10%, supplemented or not with bactoagar 1.5%)
with shaking at 37°C. Plasmids were maintained by the addition of ampicillin
(100 μg ml⁻¹ for E. coli K-12, 200 μg ml⁻¹ for EAEC), kanamycin (50 μg ml⁻¹) or
chloramphenicol (30 μg ml⁻¹). Expression of genes from pBAD, pETG20A/pRSF or
pASK-IBA vectors was induced at A600 nm ≈ 0.6 with 0.02% of L-arabinose
(Sigma-Aldrich) for 45 min, 0.5-1 mM of isopropyl-β-D-thiogalactopyranoside (IPTG,
Eurobio) for 14 h or 0.02 μg ml⁻¹ of anhydrotetracycline (AHT, IBA Technologies)
for 45 min, respectively. For BACTH experiments, plates were supplemented with
5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal, Eurobio, 40 μg ml⁻¹).
Strain construction. The tssA gene was deleted in the enteroaggregative
E. coli 17-2 strain using a modified one-step inactivation procedure as previously
described using plasmid pKOBEG. In brief, a kanamycin cassette was amplified
from plasmid pKD46 using oligonucleotides carrying 50-nucleotide extensions
homologous to regions adjacent to tssA. After electroporation of 600 ng of
column-purified PCR product, kanamycin-resistant clones were selected and verified
by colony-PCR. The kanamycin cassette was then excised using plasmid pCP20
(ref. 32). The deletion of tssA was confirmed by colony-PCR. The same procedure
was used to introduce the mCherry-coding sequence upstream of the stop codon
of the tssB gene (vector pmCh-KD4 as template for PCR amplification) or the
sfGFP-coding sequence downstream of the start codon (vector pKD4-sfGFP as template)
or upstream of the stop codon (vector psfGFP-KD4 as template) of the tssA
gene to yield strains producing TssB-mCherry, sfGFP-TssA or TssA-sfGFP from
their original chromosomal loci.

Plasmid construction. All bacterial two-hybrid plasmids and the plasmid 
producing the TssJLM membrane core complex (pRSF-TssJ°" TssL-“"TssM, 
pRSF-TssJLM) have been described previously. PCR was performed using
a Biometra thermocycler using the Q5 (New England Biolabs) or Pfu Turbo 
(Agilent Technologies) DNA polymerases. Restriction enzymes were purchased 
from New England Biolabs and used according to the manufacturer's instructions. 
Custom oligonucleotides were synthesized by Sigma Aldrich and are listed in the 
Supplementary Table. Enteroaggregative E. coli 17-2 chromosomal DNA was used 
as a template for all PCR. E. coli strain DH5α was used for cloning procedures.
All the plasmids (except for pETG20A and pDEST17 derivatives) were constructed
by restriction-free cloning as previously described. In brief, the gene of
interest was amplified using oligonucleotides introducing extensions annealing to
the target vector. The double-stranded product of the first PCR was then used
as oligonucleotides for a second PCR using the target vector as template. PCR products
were then treated with DpnI to eliminate template plasmids and transformed
into DH5α-competent cells. For protein purification, the sequences encoding the
full-length TssA (residues 1-542), the TssANt (residues 1-392), the TssANt2
(residues 221-377) and TssACt (residues 393-542) domains, the N-terminal
domain of VgrG (residues 1-490), the full-length TssE or both TssB and TssC
were cloned into the pETG-20A (TssA, TssACt, VgrGNt, TssE) or pDEST17 (TssANt,
TssBC) expression vector (gifts from A. Geerlof, EMBL, Hamburg) according to
standard Gateway protocols. Proteins produced from pETG20A derivatives are
fused to an N-terminal 6×His-tagged thioredoxin (TRX) followed by a cleavage
site for the Tobacco etch virus (TEV) protease whereas proteins produced
from pDEST17 are fused to an N-terminal 6×His tag followed by a TEV protease
cleavage site. All constructs have been verified by restriction analyses and DNA 
sequencing (Eurofins or MWG). 

Bacterial two-hybrid assay (BACTH). The adenylate cyclase-based bacterial two-hybrid
technique was used as previously published. In brief, the proteins to be
tested were fused to the isolated T18 and T25 catalytic domains of the Bordetella
adenylate cyclase. After introduction of the two plasmids producing the fusion
proteins into the reporter BTH101 strain, plates were incubated at 30°C for 48 h.
Three independent colonies for each transformation were inoculated into 600 μl of
LB medium supplemented with ampicillin, kanamycin and IPTG (0.5 mM). After
overnight growth at 30°C, 10 μl of each culture were dropped onto LB plates supplemented
with ampicillin, kanamycin, IPTG and X-Gal and incubated for 16 h at 30°C.
The experiments were done at least in triplicate and a representative result is shown. 


Fluorescence microscopy and image treatment. Fluorescence microscopy 
experiments have been performed essentially as described. In brief, cells
were grown overnight in LB medium and diluted to an A600 nm ≈ 0.04 into Sci-1
inducing medium (SIM). Exponentially growing cells (A600 nm ≈ 0.8-1) were harvested,
washed in phosphate buffered saline buffer (PBS), resuspended in PBS to
an A600 nm ≈ 50, spotted on a thin pad of 1.5% agarose in PBS, covered with a cover
slip and incubated for one hour at 37°C before microscopy acquisition. For each 
experiment, ten independent fields were manually defined with a motorized stage 
(Prior Scientific) and stored (x, y, z, PFS-offset) in our custom automation system 
designed for time-lapse experiments. Fluorescence and phase contrast micrographs 
were captured every 30s, using an automated and inverted epifluorescence micro- 
scope TE2000-E-PFS (Nikon, France) equipped with a perfect focus system (PFS). 
PFS automatically maintains focus so that the point of interest within a specimen 
is always kept in sharp focus at all times despite mechanical or thermal pertur- 
bations. Images were recorded with a CoolSNAP HQ 2 (Roper Scientific, Roper 
Scientific SARL, France) and a 100×/1.4 DLL objective. The excitation light was
emitted by a 120 W metal halide light. All fluorescence images were acquired with 
a minimal exposure time to minimize bleaching and phototoxicity effects. The 
sfGFP images were recorded by using the ET-GFP filter set (Chroma 49002) using 
an exposure time of 200-400 ms. The mCherry images were recorded by using the 
ET-mCherry filter set (Chroma 49008) using an exposure time of 100-200 ms. 
Slight movements of the whole field during the time of the experiment were cor- 
rected by registering individual frames using StackReg and Image Stabilizer plugins 
for ImageJ. sfGFP and mCherry fluorescence channels were adjusted and merged 
using ImageJ (http://rsb.info.nih.gov/ij/). For statistical analyses, fluorescent foci 
were automatically detected. First, noise and background were reduced using the
‘Subtract Background’ (20 pixels Rolling Ball) plugin from Fiji. The sfGFP foci
were automatically detected by a simple image-processing procedure: (1) create a mask of cell
surface and dilate; (2) detect the individual cells using the ‘Analyse particle’ plugin
of Fiji; (3) identify sfGFP foci with the ‘Find Maxima’ process in Fiji. To avoid
false positives, each event was manually controlled in the original data. Box-and-whisker
representations of the number of foci per cell were made with R software.
t-tests were performed in R to statistically compare each population. Kymographs
were obtained after background fluorescence subtraction and sectioning using
the Kymoreslicewide plug-in under Fiji. Fluorescent foci were detected using a
local and sub-pixel resolution maxima detection algorithm and tracked over time
with a specifically developed plug-in for ImageJ. The x and y coordinates were
obtained for each fluorescent focus on each frame. The mean square displacement
(MSD) was calculated as the distance of each focus from its location at t = 0 at each
time using R software and plotted over time. For each strain tested, the MSD of 
at least 10 individual focus trajectories was calculated. For statistical analyses of 
mobile trajectories, kymograph analyses were performed and the percentage of 
fixed, mobile with random dynamics and mobile with unidirectional trajectory 
foci are reported. 
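
The MSD computation described above can be sketched in a few lines. The authors performed this step in R; the Python version below is only an illustration, it assumes the conventional definition (squared displacement from the position at t = 0, averaged over trajectories at each frame), and the function names and toy trajectories are hypothetical.

```python
def squared_displacements(track):
    """track: list of (x, y) positions, one per frame.

    Returns the squared distance from the frame-0 position at each frame.
    """
    x0, y0 = track[0]
    return [(x - x0) ** 2 + (y - y0) ** 2 for x, y in track]


def mean_square_displacement(tracks):
    """Average the squared displacement over tracks, frame by frame.

    Tracks are truncated to the length of the shortest one.
    """
    n_frames = min(len(t) for t in tracks)
    per_track = [squared_displacements(t)[:n_frames] for t in tracks]
    return [sum(sd[i] for sd in per_track) / len(per_track) for i in range(n_frames)]


# Toy data (hypothetical): one static focus and one focus moving 1 unit per frame along x.
static = [(5.0, 5.0)] * 4
moving = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
msd = mean_square_displacement([static, moving])
print(msd)  # [0.0, 0.5, 2.0, 4.5]
```

A static cluster gives a flat MSD curve, whereas a focus pushed unidirectionally by sheath elongation gives a curve that rises roughly quadratically with time.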

Fluorescence lifetime imaging (FLIM). FLIM experiments were carried out on the
same microscope device used for the time-lapse microscopy experiments except 
with a laser of 488 nm. For each cell a region of interest that corresponds to the size 
of the laser beam was focused away from TssB-mCherry sheath-labelled sfGFP- 
TssA for a time of 3 s at a maximum intensity of 100%. The extinction of the 
complete sfGFP-TssA pool was checked by (i) the absence of recovery of bleached 
sfGFP-TssA-membrane clusters and (ii) by the overall drop and lack of recovery 
in intracellular intensity. 

Protein purification. E. coli T7 Iq pLysS cells bearing pETG20A or pDEST17 
derivatives were grown at 37°C in Terrific Broth to an A600 nm ≈ 0.9 and gene
expression was induced with 0.5 mM IPTG for 16 h at 17°C. Cells were harvested,
resuspended in Tris-HCl 20 mM pH 8.0, NaCl 150 mM and lysozyme
(0.25 mg ml⁻¹) and broken by sonication. Soluble proteins were separated from
inclusion bodies and cell debris by centrifugation for 30 min at 20,000g. The His-tagged
fusions were purified using ion metal Ni²⁺ affinity chromatography
(IMAC) using a 5-ml HisTrap column (GE Healthcare) and eluted with a step 
gradient of imidazole. The fusion proteins were further digested overnight at 
4°C by a hexahistidine-tagged TEV protease using a 1:10 (w/w) protease:protein 
ratio. The TEV protease and contaminants were retained by a second IMAC 
and the purified proteins were collected in the flow through. Proteins were 
further separated on preparative Superdex 200 or Superose 6 gel filtration col- 
umn (GE Healthcare) equilibrated in Tris-HCl 20 mM pH 8.0, NaCl 150 mM. 
The fractions containing the purified protein were pooled and concentrated 
by ultrafiltration using the Amicon technology (Millipore, California, USA). 
The seleno-methionine (SeMet) derivatives of the N- and C-terminal domains 
of TssA were produced in minimal medium supplemented with 100 mg l⁻¹ of
lysine, phenylalanine and threonine, 50 mg l⁻¹ of isoleucine, leucine, valine and
seleno-methionine. Gene induction and protein purification were performed 
as described above. 
Limited proteolysis. The full-length TssA protein was subjected to Proteinase K
limited proteolysis (1:10 protease:protein ratio). The reaction was quenched at dif- 
ferent time points by the addition of 1 mM PMSF and further boiling for 10 min at 
96°C. Samples were analysed by SDS-PAGE and Coomassie blue staining. Digested 
products were identified by Edman N-terminal sequencing and electrospray mass 
spectrometry (Proteomic platform, Institut de Microbiologie de la Méditerranée,
Marseille, France). 

Analytical gel-filtration analysis and MALS/QELS/UV/RI-coupled size- 
exclusion chromatography. Size-exclusion chromatography (SEC) was per- 
formed on an Alliance 2695 HPLC system (Waters) using KW803 and KW804 
columns (Shodex) run in Tris-HCl 20 mM pH 8.0, NaCl 150 mM at 0.5 ml per min. 
MALS, UV spectrophotometry, QELS and RI were monitored with MiniDawn 
Treos (Wyatt Technology), a Photo Diode Array 2996 (Waters), a DynaPro (Wyatt 
Technology) and an Optilab rEX (Wyatt Technology), respectively, as described.
Mass and hydrodynamic radius calculations were performed with the ASTRA
software (Wyatt Technology) using a dn/dc value of 0.185 ml g⁻¹.

Surface plasmon resonance analysis. Steady-state interactions were monitored 
using a BIAcore T200 at 25°C. All the buffers were filtered on 0.2 μm membranes
before use. The HC200m sensor chip (Xantech) was coated with purified Hcp, VgrG,
TssE or TssBC complex, immobilized by amine coupling (ΔRU = 4,000-4,300).
A control flow-cell was coated with thioredoxin immobilized by amine coupling
at the same concentration (ΔRU = 4,100). Purified TssA, TssA N-terminal and
TssA C-terminal domains (six concentrations ranging from 3.125 to 100 μM) were
injected and binding traces were recorded in duplicate. The signal from the control
flow cell and the buffer response were subtracted from all measurements. The dissociation
constants (Kd) were estimated using the GraphPad Prism 5.0 software on
the basis of the steady-state levels of ΔRU, directly related to the concentration of
the analytes. The Kd values were estimated by plotting the analyte concentrations
on the x axis and the ΔRU at a fixed time (5 s before the end of the injection
step) on the y axis. For Kd calculation, a nonlinear regression fit for xy analysis was
used with a one-site (specific binding) model, which corresponds to the equation
Y = Bmax × X/(Kd + X).
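
To make the one-site model concrete, the sketch below fits Y = Bmax × X/(Kd + X) to steady-state responses. It is not the GraphPad Prism procedure used by the authors: it is a minimal stand-in that exploits the fact that, for a fixed Kd, the best Bmax follows from linear least squares, so only Kd needs a one-dimensional search. The function names, the search bounds and the simulated responses are all assumptions made for illustration.

```python
import math


def one_site(x, bmax, kd):
    """One-site specific binding: response at analyte concentration x."""
    return bmax * x / (kd + x)


def fit_one_site(conc, resp, kd_lo=1e-2, kd_hi=1e4, rounds=4, points=200):
    """Least-squares fit of Y = Bmax*X/(Kd+X); returns (bmax, kd).

    Kd is located by a repeatedly refined logarithmic grid search
    (bounds kd_lo..kd_hi are assumptions); Bmax is solved in closed form.
    """
    def bmax_and_sse(kd):
        f = [x / (kd + x) for x in conc]
        bmax = sum(y * fi for y, fi in zip(resp, f)) / sum(fi * fi for fi in f)
        sse = sum((y - bmax * fi) ** 2 for y, fi in zip(resp, f))
        return bmax, sse

    lo, hi = math.log10(kd_lo), math.log10(kd_hi)
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / (points - 1)
        grid = [lo + i * step for i in range(points)]
        best = min(grid, key=lambda g: bmax_and_sse(10 ** g)[1])
        lo, hi = best - step, best + step  # zoom in around the current optimum
    kd = 10 ** best
    return bmax_and_sse(kd)[0], kd


# Six concentrations as in the text (μM); responses simulated with hypothetical Bmax = 200, Kd = 48.
conc = [3.125, 6.25, 12.5, 25.0, 50.0, 100.0]
resp = [one_site(x, 200.0, 48.0) for x in conc]
bmax_fit, kd_fit = fit_one_site(conc, resp)
print(f"Bmax = {bmax_fit:.1f}, Kd = {kd_fit:.1f} μM")
```

On noise-free simulated data the fit should return values close to the simulated Bmax and Kd; with real sensorgrams, duplicates and residual plots would be needed to judge the quality of the estimate.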

Co-purification experiments. Different combinations of plasmids were trans- 
formed in BL21(DE3): (i) pRSF-TssJLM + pIBA37(+); (ii) pRSF + pIBA37-"TssA; 
(iii) pRSF-TssJLM + pIBA37-!TssA; and (iv) pRSF-TssJM + pIBA37-!TssA. 
Transformed BL21(DE3) cells were grown at 37°C in 200 ml LB medium supplemented
with kanamycin and ampicillin until A600 nm ≈ 0.6 and gene induction was
achieved by the addition of IPTG (1 mM) and anhydrotetracycline (0.02 μg ml⁻¹)
for 15 h at 16°C. After cell lysis through three passages at the French press, total
membranes were isolated as described previously. Membranes were solubilized
by the addition of 1% Triton X-100 (Affymetrix). Solubilized membrane fractions
were purified on a 1 ml Streptactin column (GE Healthcare). The column was 
washed with buffer S (HEPES 50 mM pH 7.5, NaCl 50 mM, Triton X-100 0.075%) 
and bound proteins were eluted with buffer S supplemented with desthiobiotin 
(2.5 mM) and visualized by Coomassie blue staining and immunoblotting. For 
electron microscopy (EM) analyses, BL21(DE3) cells producing TssJLM and Flag- 
tagged TssA were grown and the TssJLM-A complex was purified as described for 
the TssJLM membrane core complex. After the two consecutive affinity columns
(His- and Strep-Trap-HP), the pooled fractions were injected onto a Superose 6
10/300 column equilibrated in HEPES 50 mM pH 7.5, NaCl 50 mM supplemented
with 0.025% DM-NPG.

Electron microscopy observation of the TssJLMA complex. Nine microlitres
of the purified TssJLMA complex (~0.01 mg ml⁻¹) were incubated on glow-discharged
carbon-coated copper grids (Agar Scientific) for 30 s. After adsorption,
the sample was blotted, washed with three drops of water and then stained with
2% uranyl acetate. Images were collected on an FEI Tecnai F20 FEG microscope
operating at a voltage of 200 kV, equipped with a direct electron detector (Falcon
II) at 50,000× magnification.

Transmission electron microscopy, single particle analysis and image processing. 
Nine microlitres of the purified full-length TssA protein (~0.01 mg ml⁻¹) were
incubated on a glow-discharged carbon-coated copper grid (Agar Scientific)
for 30 s. After adsorption, the sample was blotted, washed with three drops of
water and then stained with 2% uranyl acetate. Images were recorded automatically
using the EPU software on a FEG microscope operating at a voltage
of 200 kV and a defocus range of 0.6-25 nm, using a FEI Falcon-II detector
(Gatan) at a nominal magnification of 50,000, yielding a pixel size of 1.9 Å. A
dose rate of 25 electrons per Å² per second, and an exposure time of 1 s were
used. A total of 100,000 particles were automatically selected from 500 independent
images and extracted within boxes of 180 pixels × 180 pixels using
EMAN2/BOXER. The CTF was estimated and corrected by phase flipping
using EMAN2 (e2ctf). All two- and three-dimensional (2D and 3D) classifications
and refinements were performed using RELION 1.3 (refs 39, 40). The
automatically selected data set was cleaned up by three rounds of reference-free
2D class averaging, and highly populated classes displaying high-resolution
features were conserved and a final data set of 20,000 particles was assembled.
An initial 3D model was generated in EMAN2 using 30 classes. 3D classification
was then performed in Relion with five classes. The particles corresponding
to the most populated class (~18,000) were used for refinement. The Relion auto-refine
procedure was used to obtain a final reconstruction at ~19 Å resolution after
masking and with D6 symmetry imposed. Reported resolutions are based on the
gold-standard Fourier shell correlation (FSC) 0.143 criterion; the FSC curve was
corrected for the effects of a soft mask using high-resolution
noise substitution (Extended Data Fig. 5o). All density maps were corrected for
the modulation transfer function of the detector and then sharpened by applying
a negative B-factor (−1,000) that was estimated using automated procedures. The
electron microscopy map of the EAEC TssA full-length protein has been deposited 
in the Electron Microscopy Data Bank under accession number EMD-3282. 
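
As an illustration of how a resolution value is read from an FSC curve with the 0.143 criterion mentioned above, the sketch below interpolates linearly between the two shells that bracket the threshold. It is not the RELION implementation; the function name and the FSC values are hypothetical and do not reproduce the curve in Extended Data Fig. 5o.

```python
def resolution_at_threshold(spatial_freq, fsc, threshold=0.143):
    """Return the resolution (1/frequency, in Å) where the FSC first drops below threshold.

    spatial_freq: increasing spatial frequencies in 1/Å; fsc: correlation per shell.
    Linear interpolation is used between the two bracketing shells.
    Returns None if the curve never falls below the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f_cross = spatial_freq[i - 1] + frac * (spatial_freq[i] - spatial_freq[i - 1])
            return 1.0 / f_cross
    return None


def nyquist_limit(pixel_size):
    """Best resolution attainable for a given pixel size (twice the pixel size)."""
    return 2.0 * pixel_size


# Hypothetical FSC curve, for illustration only.
freqs = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06]
fsc = [1.00, 0.98, 0.90, 0.60, 0.243, 0.043]
res = resolution_at_threshold(freqs, fsc)
print(f"FSC = 0.143 resolution: {res:.1f} Å; Nyquist at 1.9 Å/pixel: {nyquist_limit(1.9):.1f} Å")
```

For negative-stain data such as these, the reported resolution is far from the Nyquist limit set by the 1.9 Å pixel size, so the detector sampling is not the limiting factor.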
Small-angle X-ray scattering analysis and ab initio 3D shape reconstruction. 
Small-angle X-ray scattering (SAXS) analyses were performed at the ID29 beamline 
(European Synchrotron Radiation Facility, Grenoble, France) at a working energy 
of 12.5 keV (λ = 0.931 Å). Thirty microlitres of protein solution at 1.6, 3.7, 7.1, 9.8
and 14.9 mg ml⁻¹ in Tris-HCl 20 mM pH 8.0, NaCl 150 mM were loaded by a robotic
system into a 2-mm quartz capillary mounted in a vacuum and ten independent 
10-s exposures were collected on a Pilatus 6M-F detector placed at a distance of 
2.85 m for each protein concentration. Individual frames were processed automat- 
ically and independently at the beamline by the data collection software (BsxCUBE), 
yielding radially averaged normalized intensities as a function of the momentum 
transfer q, with q = 4π sin(θ)/λ, where 2θ is the total scattering angle and λ is the
X-ray wavelength. Data were collected in the range q = 0.04-6 nm⁻¹. The ten frames
were combined to give the average scattering curve for each measurement. Data 
points affected by aggregation, possibly induced by radiation damage, were excluded. 
Scattering from the buffer alone was also measured before and after each sample 
analysis and the average of these two buffer measures was used for background 
subtraction using the program PRIMUS from the ATSAS package. PRIMUS was
also used to perform Guinier analysis of the low q data, which provides an estimate
of the radius of gyration (Rg). Regularized indirect transforms of the scattering data
were carried out with the program GNOM to obtain P(r) functions of interatomic
distances. The P(r) function has a maximum at the most probable intermolecular
distance and goes to zero at Dmax, the maximum intramolecular distance. The values
of Dmax were chosen to fit with the experimental data and to have a positive P(r)
function. Three-dimensional bead models that fit with the scattering data were
built with the program DAMMIF. Ten independent DAMMIF runs were performed
using the scattering profile of TssA, with data extending up to 0.35 nm⁻¹,
using slow mode settings, assuming P6 symmetry and allowing for a maximum of 500
steps to grant convergence. The models resulting from independent runs were superimposed
using the DAMAVER suite, yielding an initial alignment of structures
based on their axes of inertia followed by minimisation of the normalized spatial
discrepancy (NSD). The NSD was therefore computed between a set of ten independent
reconstructions, with a range of NSD from 0.678 to 0.815. The aligned
structures were then averaged, giving an effective occupancy to each voxel in the
model, and filtered at half-maximal occupancy to produce models of the appropriate
volume that were used for all subsequent analyses. All the models were similar
in terms of agreement with the experimental data, as measured by the DAMMIF χ
parameter and the quality of the fit to the experimental curve (calculated
√χ = 1.774). The SAXS data parameters are provided in Extended Data Table 1.
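
Two of the relations used in the SAXS analysis above lend themselves to a short worked example: the definition of the momentum transfer, q = 4π sin(θ)/λ, and the Guinier approximation, ln I(q) ≈ ln I(0) − (Rg²/3) q², from which Rg is obtained by a linear fit at low q. The sketch below is not the PRIMUS implementation; the function names and the synthetic intensities (Rg = 8 nm, I(0) = 100) are assumptions chosen for illustration.

```python
import math


def momentum_transfer(two_theta_deg, wavelength_nm):
    """q = 4*pi*sin(theta)/lambda, where 2*theta is the total scattering angle."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_nm


def guinier_rg(q, intensity):
    """Fit ln I = ln I0 - (Rg^2 / 3) * q^2 by ordinary least squares.

    Returns (Rg, I0). Only valid in the low-q region (roughly q*Rg < 1.3);
    a non-negative slope (e.g. from aggregation) would make Rg undefined.
    """
    xs = [qi * qi for qi in q]
    ys = [math.log(i) for i in intensity]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic low-q data (hypothetical): Rg = 8 nm, I0 = 100, q in nm^-1.
q = [0.04 + 0.01 * i for i in range(10)]
intensity = [100.0 * math.exp(-(8.0 ** 2) * qi ** 2 / 3.0) for qi in q]
rg, i0 = guinier_rg(q, intensity)
print(f"Rg = {rg:.2f} nm, I0 = {i0:.1f}")
```

Because the synthetic intensities follow the Guinier law exactly, the fit recovers the input Rg and I(0); with experimental data the usable q range has to be chosen so that the Guinier plot is linear.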
Crystallization, data collection, processing and refinement. Seleno-methionine 
(SeMet)-labelled TssANt and TssACt crystallization trials were carried out by the
sitting-drop vapour diffusion method in 96-well Greiner crystallization plates at
20°C, using a nanodrop-dispensing robot (Cartesian Inc.). Crystals of SeMet-labelled
TssACt grew in a few days after mixing 300 nl of protein at 4.7 mg ml⁻¹
with 100 nl of 20% PEG 8000, 0.2 M calcium acetate, 0.1 M MES pH 6.8. Crystals
of SeMet-labelled TssANt grew in a few days after mixing 300 nl of protein at
4.7 mg ml⁻¹ with 100 nl of 29% PEG 3350, 0.1 M HEPES pH 7.5. Crystals were
cryoprotected with mother liquor supplemented with 20% polyethylene glycol and
flash frozen in liquid nitrogen. Data sets were collected at the SOLEIL Proxima 1
beamline (Saint-Aubin, France). After processing the data with XDS, the scaling
was performed with SCALA and the structures were solved using the SHELXD
program. The structures were refined with AutoBUSTER alternated with model
rebuilding using COOT. The final data collection and refinement statistics are
provided in Extended Data Table 2. The Ramachandran plots of the TssANt and
TssACt structures exhibit 90.7/3.3% and 91.8/2.9% of residues in the favoured and outlier
areas, respectively. Figures were made with PyMOL.

Tail sheath modelling and TssA docking to contracted and extended sheaths.
The tail sheath modelling was performed using the Vibrio cholerae VipAB (TssBC)
complex as starting structure (PDB: 3J9G) and the contracted tail sheath structures
of Vibrio cholerae. To date, however, the molecular structure of the extended
(non-contracted) sheath is not available. In a recent paper, a low-resolution model
of the extended VipAB sheath was built using the low-resolution EM map
of the extended T4 phage tail sheath. By superimposing the VipAB EM map onto
the gp18 bacteriophage T4 sheath protein structure, gross features of the sheath
structure were obtained. A similar approach was applied with Chimera using
the VipAB molecular model in the extended T4 phage tail sheath instead of
the low-resolution VipAB EM map, yielding a model similar to that of Kube et al.,
but with molecular details. The sheath internal channel diameter shrinks from 110
to ~95 Å, and the external diameter from ~290 Å to ~190 Å. The internal diameter
of the tail sheath makes it possible to fit stacked Hcp hexamers that are in contact
with the tail sheath internal wall. Both extended and contracted tail sheath
conformations were used to explore the feasibility of sheath complexes with TssA using its
EM map. Because TssA is at the distal end of the sheath, the polarity of the sheath was
taken into consideration. It was suggested that the polarity of the T6SS tail sheath is
similar to that of bacteriophage T4 and therefore that the VipA (TssB) N-terminal
and VipB (TssC) C-terminal helices point to and contact the baseplate. TssA
was therefore docked at the opposite extremity of the tail sheath using Chimera.
Miscellaneous. Hcp release and fractionation assays were performed
as previously described. SDS-polyacrylamide gel electrophoresis was performed
using standard protocols. For immunostaining, proteins were transferred
onto 0.2-µm nitrocellulose membranes (Amersham Protran), and immunoblots
were probed with primary antibodies and goat secondary antibodies coupled to
alkaline phosphatase, and developed in alkaline buffer with
5-bromo-4-chloro-3-indolylphosphate and nitroblue tetrazolium. The anti-TolB polyclonal antibodies
are from our laboratory collection, while the anti-Flag (M2 clone, Sigma Aldrich)
and anti-EF-Tu (Roche) monoclonal antibodies and alkaline phosphatase-conjugated
goat anti-rabbit or anti-mouse secondary antibodies (Beckman Coulter) were
purchased as indicated.

Accession numbers. Coordinates and structure factors have been deposited in
the Protein Data Bank under accession numbers 4YO3 and 4YO5 for TssANt2
and TssACt, respectively. The electron microscopy map for full-length TssA has been
deposited in the Electron Microscopy Data Bank (EMDB) under accession code
EMD-3282.
Extended Data Figure 1 | Hcp and TssC interact with TssA, a
cytoplasmic protein required for sheath assembly and Hcp release.
a, Schematic representation of the architecture of the bacterial type
VI secretion system. The scheme highlights the membrane complex
anchoring the tail structure composed of the assembly baseplate, the
spike, the tube and the sheath (cyto, cytoplasm; IM, inner membrane;
PG, peptidoglycan layer; OM, outer membrane). b, Bacterial two-hybrid
assay. BTH101 reporter cells producing the indicated proteins or domains
(TssLc, cytoplasmic domain of the TssL protein; TssMc and TssMp,
cytoplasmic and periplasmic domains of the TssM protein, respectively)
fused to the T18 or T25 domain of the Bordetella adenylate cyclase
were spotted on plates supplemented with IPTG and the chromogenic
substrate X-Gal. Interaction between the two fusion proteins is attested
by the dark blue colour of the colony. The TolB-Pal interaction serves as
a positive control. c, The absence of TssA prevents T6SS sheath dynamics.
Fluorescence microscopy time-lapse recordings showing sheath dynamics
using the chromosomally encoded tssB-mCherry fusion in wild-type
(WT) (tssB-mCherry pBAD33), ΔtssA (ΔtssA tssB-mCherry pBAD33)
and complemented ΔtssA (ΔtssA tssB-mCherry pBAD33-TssAVSV-G)
cells. Individual images were taken every 30 s. Assembly and contraction/
disassembly events are indicated above the time-lapse images. The scale
bars are 1 µm. d, The absence of TssA prevents Hcp release. Hcp release
was assessed by separating whole cells (C) and supernatant (SN) fractions
from 17-2 (WT), ΔtssA (ΔtssA pBAD33, tssA) and complemented ΔtssA
(ΔtssA pBAD33-TssAVSV-G; tssA WT) cells producing Flag-epitope-tagged
Hcp. A total of 1 × 10⁸ cells and the TCA-precipitated material from
the supernatant of 5 x 10° cells were analysed by western blot using
anti-Flag monoclonal antibody (lower panel) and anti-TolB polyclonal
antibodies as a lysis control (upper panel). The molecular weight markers
(in kDa) are indicated on the left. The uncropped scans of the western
blots are provided in the Supplementary Figure. e, TssA co-fractionates
with cytoplasmic and membrane proteins. A fractionation procedure was
applied to EAEC ΔtssA cells producing Flag-tagged TssA. Whole cells
(T) were fractionated to isolate the supernatant (SN), periplasmic (P),
cytoplasmic (C) and total membrane (M) fractions. Extracts from
10° (T) or 2 x 10? (SN, P, C, M) cells were separated by SDS-PAGE
and immunodetected with anti-Flag monoclonal (TssA), anti-EF-Tu
(cytoplasmic marker) and TolB (periplasmic marker) antibodies.
The molecular weight markers (in kDa) are indicated on the left. The
uncropped scans of the western blots are provided in Supplementary
Figure 1.
Extended Data Figure 2 | Purification and negative-stain electron
microscopy analyses of the TssJLM-TssA complex. a, TssA interacts
with the TssJLM complex. The total solubilized membrane extract (T) of
4 x 10° cells producing the indicated proteins was subjected to affinity
chromatography using streptactin resin. Bound proteins (E) were
separated by SDS-PAGE and immunodetected with anti-Flag (TssA and
TssL), anti-strep tag (TssJ) and anti-5 × His (TssM) monoclonal antibodies.
The molecular weight markers are indicated on the left. The uncropped
scans of the western blots are provided in Supplementary Figure 1.
b, Superose 6 10/300 gel-filtration profile of the purified TssJLM-TssA
complex. The asymmetry of the peak probably reflects the co-purification
of different complexes or the dissociation of TssA from the TssJLM
complex. c, Examples of representative raw particles observed for the
purified TssJLM-TssA complex sample using negative-stain electron
microscopy. A typical TssJLM complex is shown in red (number of
particles observed n = 240) whereas a TssA-bound TssJLM complex is
shown in white (n = 95). Scale bar is 10 nm. d, Magnification of the two
complexes shown in c. Scale bars, 10 nm.
Extended Data Figure 3 | TssA localization and dynamics. a, Mean
square displacement (MSD; in arbitrary units (a.u.)) of representative
sfGFP-TssA clusters in a wild-type strain (red line) or its ΔtssBC isogenic
derivative (black line), measured by sub-pixel tracking of fluorescent
foci and plotted over time (in s). b, Kymographic analysis reporting
representative sfGFP-TssA (green) and TssB-mCherry (red) positions
within the cell as a function of time. c, Representative fluorescence lifetime
imaging microscopy (FLIM) of sfGFP-TssA clusters in the sfGFP-TssA/
TssB-mCherry strain. A membrane-associated sfGFP-TssA cluster was
chosen to define the bleached area (red circle). The laser (488 nm) was
set to maximum power and focused for 3 s to ensure complete bleaching
of the GFP diffusible pool. Images were taken every 30 s to follow
recovery dynamics. The scale bar is 1 µm. d, Quantification of sfGFP-TssA
fluorescence dynamics over time after bleaching. The dynamics
of fluorescence intensity is shown over time for n = 10 independent
sfGFP-TssA foci after FLIM (blue line). The fluorescence intensity of the
bleached focus was also followed over time (FRAP, red line). As a control
for laser focusing and intensity, membrane-associated clusters were
systematically bleached in these experiments and showed no recovery,
suggesting that the total intracellular sfGFP-TssA had been bleached by the
laser. e, Representative fluorescence microscopy time-lapse recordings of
the indicated ΔtssK, ΔtssE, ΔtssF, ΔtssG or Δhcp cells producing sfGFP-TssA.
Individual images were taken every 30 s. Red arrowheads indicate
the localizations of TssA foci. The scale bar is 1 µm. f, Representative
large fields of fluorescence microscopy analyses showing localization of
sfGFP-TssA in the indicated strains. The scale bars are 1 µm. g, Box-and-whisker
plots of the measured number of sfGFP-TssA foci per cell for each
indicated strain. The lower and upper boundaries of the boxes correspond
to the 25th and 75th percentiles, respectively. The black bold horizontal bar
represents the median value for each strain and the whiskers represent the
10th and 90th percentiles. Outliers are shown as open circles. A Student's
t-test was used to report significant differences (ns, not significant;
**P < 0.0001). The number of cells studied per strain (n) is indicated on
top. h, Statistical analyses of sfGFP-TssA dynamics. sfGFP-TssA dynamics
were categorized as ‘fixed’, ‘mobile with unidirectional trajectory’ and
‘mobile with random dynamics’, and the number of sfGFP-TssA foci (n, on
top) in each category is represented as a percentage for each indicated
strain. Kymographs for the first two categories are shown at the bottom of
the panel. i, Schematic representation of the assembly pathway of the T6SS
based on this study and available data. The biogenesis starts
with the initial positioning of the TssJ outer membrane lipoprotein and the
sequential recruitment of the indicated subunits (from left to right). The
recruitment of TssA is dependent on TssM, and that of TssK is dependent
on both TssL and TssA. The exact positions of VgrG and TssE (blue) in the
pathway are not known, but these two subunits are not required for TssA
recruitment and are necessary for Hcp and TssBC polymerization.
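The mean square displacement plotted in panel a can be obtained from the tracked positions of a focus. The sketch below assumes the common time-averaged definition, MSD(τ) = ⟨|r(t + τ) − r(t)|²⟩, applied to a single two-dimensional track; the legend does not state which estimator was used, and the function name and the two toy tracks are hypothetical.

```python
def mean_square_displacement(track, max_lag=None):
    """Time-averaged MSD of one 2D track.

    track: list of (x, y) positions sampled at a constant interval.
    Returns a list whose element k is the MSD at a lag of k + 1 frames.
    Assumed estimator for illustration; not taken from the source.
    """
    n = len(track)
    if max_lag is None:
        max_lag = n - 1
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            (track[i + lag][0] - track[i][0]) ** 2
            + (track[i + lag][1] - track[i][1]) ** 2
            for i in range(n - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd

# Toy tracks (arbitrary units): an immobile focus and one moving at constant speed.
fixed_focus = [(5.0, 5.0)] * 6
moving_focus = [(float(t), 0.0) for t in range(6)]

msd_fixed = mean_square_displacement(fixed_focus)
msd_moving = mean_square_displacement(moving_focus)
print(msd_fixed)
print(msd_moving)
```

An immobile focus gives a flat MSD, whereas directed motion at constant speed gives an MSD that grows with the square of the lag, which is the kind of contrast such a plot is meant to expose.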
Extended Data Figure 4 | TssA interacts with tail and baseplate
components. a, TssA interaction network identified by bacterial
two-hybrid analysis (see legend to Extended Data Fig. 1b). b-e, Surface
plasmon resonance interaction study of TssA with its partners identified
by BACTH. Sensorgrams (variation of plasmon resonance in arbitrary
units (ARU) as a function of reaction time (in s)) were recorded upon
injection of the purified native TssA protein (concentrations of 3.125
(dark grey), 6.25, 12.5, 25, 50 and 100 (light grey) µM) on HC200m chips
coated with the purified N-terminal domain of VgrG (b), purified TssE (c),
Hcp (d) or TssBC complex (e) (upper panels). The graph reporting ARU
as a function of the TssA concentration (lower panel) was used to estimate
the indicated apparent dissociation constants (Kd). Off-rates (percentage
of dissociation 400 s after ligand injection) are indicated.
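An apparent Kd can be estimated from a plot of response against analyte concentration, as in the lower panels. The sketch below assumes a 1:1 steady-state binding model, R = Rmax·C/(Kd + C), and fits it through the Hanes-Woolf linearization C/R = C/Rmax + Kd/Rmax. The legend does not state which model or fitting procedure was used, so this is an illustration only, and a nonlinear least-squares fit is generally preferable for noisy data. The concentrations are those listed in the legend; the Kd and Rmax values are hypothetical.

```python
def fit_steady_state_kd(conc, response):
    """Estimate (Kd, Rmax) for R = Rmax * C / (Kd + C).

    Uses the Hanes-Woolf form C/R = C/Rmax + Kd/Rmax and ordinary
    least squares. Hypothetical helper for illustration only.
    """
    y = [c / r for c, r in zip(conc, response)]
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(y) / n
    sxx = sum((c - mean_x) ** 2 for c in conc)
    sxy = sum((c - mean_x) * (yi - mean_y) for c, yi in zip(conc, y))
    slope = sxy / sxx                      # equals 1 / Rmax
    intercept = mean_y - slope * mean_x    # equals Kd / Rmax
    rmax = 1.0 / slope
    kd = intercept * rmax
    return kd, rmax

# Injected concentrations from the legend (in µM).
conc_uM = [3.125, 6.25, 12.5, 25.0, 50.0, 100.0]

# Hypothetical, noise-free responses generated from assumed parameters.
KD_TRUE, RMAX_TRUE = 20.0, 500.0
response = [RMAX_TRUE * c / (KD_TRUE + c) for c in conc_uM]

kd, rmax = fit_steady_state_kd(conc_uM, response)
print(f"Kd = {kd:.2f} uM, Rmax = {rmax:.1f} ARU")
```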
Extended Data Figure 5 | TssA oligomerization and SAXS and EM
structural models. a, 10 µg of purified TssA were analysed by SDS-PAGE
and Coomassie blue staining. The molecular weight markers (in kDa) are
indicated on the left, and TssA and its theoretical size are indicated on the
right. b, Superose 6 10/300 gel filtration profile of purified TssA (black
line) and protein markers of known size (coloured lines). c, MALS/QELS/
UV/RI analysis of purified TssA. The molecular mass of the TssA complex
is indicated. d-j, Low-resolution SAXS model of TssA. d, Experimental
scattering data calculated from an ab initio model of TssA. The square root
χ value of the ‘best representative’ model is indicated. e, Representation of
the Guinier plot calculated from the experimental curve. f, Pair distance
distribution. g, Kratky plot representative of a multi-domain protein with
flexible linkers. h-j, SAXS envelope of the ‘best representative’ model
of TssA, with top (h), side (i) and tilted (j) views. The scale bar is 100 Å.
k-r, Low-resolution EM model of TssA. k, Representative micrograph
of the data set used for image processing. White circles indicate isolated
TssA dodecamers. l, Representative selected TssA particles. m, n, Gallery
of representative top (m) and side (n) class averages generated after
reference-free 2D classification using Relion. o, Fourier shell correlation
(FSC) curve of the TssA reconstruction. The gold standard FSC curve was
calculated in Relion using the masked reconstruction of TssA. p-r, Top
(p), side (q) and tilted (r) views of the three-dimensional reconstruction
model of the TssA dodecamer obtained by electron microscopy
(accession number: EMD-3282). The scale bar is 50 Å. Whereas the
SAXS model allows better visualization of the arm length than the
EM reconstruction, its low resolution impairs the visual separation of the
dimeric arms.
Extended Data Figure 6 | Identification, oligomerization and
interaction analysis of TssA domains. a, Limited proteolysis of purified
TssA. The purified full-length TssA protein (first lane) was subjected
to proteinase K limited proteolysis for the time indicated on top of each
lane and analysed by SDS-PAGE and Coomassie blue staining. Stable
fragments are indicated on the right with their boundaries (numbers
identified in the sequence in b) and the corresponding fragment.
The uncropped scan of the Coomassie blue stained gel is provided in
Supplementary Figure 1. b, TssA protein sequence. The boundaries of the
stable fragments obtained after proteinase K limited
proteolysis and electrospray mass spectrometry analyses are indicated by arrows.
The secondary structures observed in the crystal structures (Fig. 2a
and Extended Data Fig. 6f, g) are indicated on top of the corresponding
sequence. c, Bacterial two-hybrid analysis of TssANt2 and TssACt
interactions (see legend to Extended Data Fig. 1b). d, e, MALS/QELS/UV/RI
analysis of the purified TssANt2 (d) and TssACt (e) fragments. f, g, X-ray
structure of the TssANt2 domain (PDB: 4YO3). The rainbow-coloured
ribbon representation of the TssANt2 monomer is shown (f, consecutive
α-helices numbered α1 to α7) whereas the dimeric structure (g) highlights
the helices at the interface (α1, α2 and α6). h, The TssA central core
interacts with Hcp and VgrG whereas the TssA arms interact with TssE
and TssC. Bacterial two-hybrid analysis of TssANt2 and TssACt interactions
(see legend to Extended Data Fig. 1b). i, j, Surface plasmon resonance
interaction study of the purified TssACt (i) or TssANt2 (j) domains with
the Hcp protein (i) or the TssBC complex (j). Sensorgrams (variation of
plasmon resonance in arbitrary units (ARU) as a function of reaction time
(in s)) were recorded upon injection of the purified TssA C-terminal
(i) or TssANt2 (j) domains (concentrations of 3.125 (dark grey), 6.25, 12.5, 25,
50 and 100 (light grey) µM) on HC200m chips coated with the purified Hcp
protein (i) or the TssBC complex (j) (upper panels). The graph reporting
ARU as a function of the TssA domain concentration (lower panel) was used
to estimate the indicated apparent dissociation constants (Kd).
Extended Data Figure 7 | Comparison of the SAXS, EM and X-ray
structures of TssA. a, Schematic representation and colour code of
the constructs used for SAXS (grey), electron microscopy (light blue)
and X-ray (TssANt2, dark blue; TssACt, red) analyses. The epitopes
and theoretical molecular masses of the domains are indicated.
TRX, thioredoxin; N, N terminus; C, C terminus. b, Fit between the
experimental data (green dots) and the scattering curves calculated for
TssANt2 and TssACt by CRYSOL (red line). c-f, SAXS/X-ray
comparison. Top (c), side (d) and bottom (f) views of the fitting of TssANt2
(blue ribbon) and TssACt (red ribbon) X-ray structures into the TssA SAXS
envelope (transparent grey surface). Scale bars are 10 nm. e, Magnification
of a cut-away section of the fitting shown in d. Scale bar is 5 nm.
g-i, SAXS/EM/X-ray comparison. Top (g) and side (h) views of the
superimposition of SAXS (grey surface), EM (transparent light-blue
surface) and X-ray structures of TssA. Scale bars are 10 nm. i, Magnification
of a cut-away section of the superimposition shown in h. j-n, EM/X-ray
comparison. Top (j), side (k) and bottom (l) views of the fitting of
TssANt2 (blue ribbon) and TssACt (red ribbon) X-ray structures into the
TssA EM envelope (transparent grey surface). Scale bars are 10 nm.
m, n, Magnifications of the top and bottom views of the docking of the
TssA domain X-ray structures into the TssA EM map, highlighting the
interface between the TssA central core (TssACt, red ribbon) and arms
(TssANt2, blue ribbon). The C-terminal helix of TssANt2 (ends at position
377) and the N-terminal helix of TssACt (starts at position 395) are shown in
yellow. o, Top view of the fitting of the X-ray structure of EAEC Hcp
(green ribbon, PDB 4HKH) into the TssA SAXS envelope (grey surface).
The scale bar is 10 nm.
Extended Data Figure 8 | Models of tail-sheath TssA complexes and
comparison between the bacteriophage T4 gp15 and T6SS TssA
subunits. a, b, Surface (a) and cross-section (b) views of the complex of
TssA (EM map, blue) with the extended tail sheath model (the last four
rows shown in different colours). In the cut-away section, four stacks
of Hcp rings are visible. As shown by bacterial two-hybrid and SPR
analyses, Hcp contacts the TssACt central core whereas TssBC contacts
the TssANt2 arms. The TssA arms fit between the TssBC monomers of the
last row. c, Surface view of the complex of TssA (EM map, blue) with
the contracted tail sheath model (the last four rows shown in different
colours), highlighting the loose packing between TssA and the tail sheath
in this conformation, suggesting that TssA might dissociate after sheath
contraction. d-f, Comparison between the bacteriophage T4 gp15 and
T6SS TssA subunits. d, e, Schematic representations of the bacteriophage
T4 tail distal end comprising the gp19 tube (grey) and gp18 sheath (blue)
proteins and the gp3 (green) and gp15 (red) neck proteins (d) and the
T6SS tail distal end comprising the Hcp tube (grey) and TssBC sheath
(blue) proteins and the TssA dodecamer (red) (e). The possibility that
a functional homologue of bacteriophage T4 gp3 exists is shown by the
question mark. f, Fitting of the model of the gp15 structure in complex
with the last row of the gp18 sheath (in purple) in the TssA SAXS
envelope (grey surface).
Extended Data Table 1 | SAXS parameters of TRX-TssA

Facilities and parameters          Settings and values

Data Collection Parameters
Beam line                          ESRF (Grenoble, France) BM29
Wavelength (Å)                     0.992
Detector                           Pilatus 1M
q range (nm⁻¹)                     0.028-4.525
Exposure time (s)                  1 (10 x 10 sec)
Concentration range (mg/ml)        1.6-14.9
Temperature (°C)                   20

Structural Parameters
Rg from Guinier fitting (Å)        107.1 ± 0.25
I(0) from Guinier fitting          615.49 ± 1.16
Rg from GNOM (Å)                   114.5 ± 0.055
I(0) from GNOM                     629.7 ± 1.54
Dmax (Å)                           420 ± 0.093
VPorod from PRIMUS (Å³)            3081.4 x10°
MWpred (kDa) (a)                   891
MWSEC-MALS (kDa) (b)               952
MWSAXS (kDa) (c)                   629
NSD of DAMMIF models               0.678-0.815

Software Employed
Primary Data Processing            PRIMUS
P(r)                               GNOM
Ab initio Shape Analysis           DAMMIF
Validation and averaging           DAMAVER
SAXS Profile computation           CRYSOL
Molecular Visualization            CHIMERA; PyMol

(a) MWpred, theoretical mass of the TRX-TssA protein.
(b) MWSEC-MALS, mass measured by SEC-MALS experiments.
(c) MWSAXS, mass calculated from the I(0).
Extended Data Table 2 | Data collection, phasing and refinement statistics for SAD (SeMet) structures

Data collection
Space group
Cell dimensions
  a, b, c (Å)
  α, β, γ (°)
Wavelength
Resolution (Å)*
Rmerge*
I/σI*
Completeness (%)*
Redundancy*

Refinement
Resolution (Å)*
No. reflections*
Rwork/Rfree
No. atoms
  Protein
  Ligand/ion
  Water
B-factors
  Protein
  Ligand/ion
  Water
R.m.s. deviations
  Bond lengths (Å)
  Bond angles (°)

One crystal was used for data collection.
*Highest-resolution shell is shown in parentheses.
Structural basis of outer membrane
protein insertion by the BAM complex

Yinghong Gu, Huanyu Li, Haohao Dong, Yi Zeng, Zhengyu Zhang, Neil G. Paterson, Phillip J. Stansfeld,
Zhongshan Wang, Yizheng Zhang, Wenjian Wang & Changjiang Dong

All Gram-negative bacteria, mitochondria and chloroplasts have outer membrane proteins (OMPs) that perform many
fundamental biological processes. The OMPs in Gram-negative bacteria are inserted and folded into the outer membrane
by the β-barrel assembly machinery (BAM). The mechanism involved is poorly understood, owing to the absence of a
structure of the entire BAM complex. Here we report two crystal structures of the Escherichia coli BAM complex in two
distinct states: an inward-open state and a lateral-open state. Our structures reveal that the five polypeptide
transport-associated domains of BamA form a ring architecture with four associated lipoproteins, BamB-BamE, in the periplasm.
Our structural and functional studies and molecular dynamics simulations indicate that these subunits rotate with respect
to the integral membrane β-barrel of BamA to induce movement of the β-strands of the barrel and promote insertion of
the nascent OMP.


OMPs have important roles in Gram-negative bacteria, mitochondria
and chloroplasts in nutrient transport, protein import, secretion and
other fundamental biological processes. Dysfunction of mitochondrial
outer membrane proteins is linked to disorders such as diabetes,
Parkinson disease and other neurodegenerative diseases. The OMPs
are inserted and folded correctly into the outer membrane (OM) by the
conserved OMP85 family proteins, suggesting that similar insertion
mechanisms may be used in Gram-negative bacteria, mitochondria
and chloroplasts.

In Gram-negative bacteria, OMPs are synthesized in the cytoplasm,
and are transported across the inner membrane by SecYEG into the
periplasm. The 17-kilodalton (kDa) protein (Skp) and the survival
factor A (SurA) chaperones escort the unfolded OMPs across the periplasm
to the BAM, which is responsible for insertion and assembly
of OMPs into the OM. In E. coli, the BAM complex consists of
BamA and four lipoprotein subunits, BamB, BamC, BamD and BamE.
BamA comprises five amino-terminal polypeptide transport-associated
(POTRA) domains and a carboxy-terminal OMP transmembrane
barrel, while the four lipoproteins are affixed to the membrane
by N-terminal lipid-modified cysteines. Of these subunits, BamA
and BamD are essential. One copy of each of these five proteins is
required to form the BAM complex with an approximate molecular
mass of 200 kDa (Extended Data Fig. 1). In vitro reconstitution of
the E. coli BAM complex and functional assays showed that all five
subunits are required to obtain the maximum activity of BAM.
Individual structures of BamA, BamB, BamC, BamD
and BamE have previously been reported, as have complex structures
of BamD with the N-terminal domain of BamC, and BamB with
POTRA 3 and 4 of BamA. Nonetheless, understanding of the precise mechanism of
OMP insertion by the BAM complex is hampered by the lack
of a complete structure of the BAM complex. Furthermore, it is
unknown how BAM manages to insert OMPs into the OM without the
use of ATP, proton motive force or redox potentials.


Here we report two novel crystal structures of the E. coli BAM
complex: BamABCDE and BamACDE. The complexes reveal a unique
ring architecture that adopts two distinct conformations: an inward-open
and a novel lateral-open conformation. Furthermore, comparison of the two
complexes reveals that the periplasmic units are rotated with respect
to the barrel, which appears to be linked to important conformational
changes in the β-strands β1C-β6C of the barrel. Taken together, this
suggests a new insertion mechanism in which rotation of the BAM
periplasmic ring promotes insertion of OMPs into the OM. To our
knowledge, this is the first reported crystal structure of an intramembrane
barrel with a lateral-open conformation.


Unique architecture of two E. coli BAM complexes 

X-ray diffraction data of selenomethionine-labelled crystals were collected
to 3.9 Å resolution, and the BAM structure was determined by
single-wavelength anomalous dispersion (SAD) and manual molecular
replacement (Methods and Extended Data Table 1). The first structure
contained four proteins: BamA, BamC, BamD and BamE (Fig. 1a-c),
with the electron density and crystal packing indicating that
BamB is absent from the complex. This was confirmed by SDS-PAGE analysis
of the crystals (Extended Data Fig. 1 and Supplementary Fig. 1). In this
model, BamA, BamC, BamD and BamE contain residues Glu22-Ile806,
Cys25-Lys344, Glu26-Ser243 and Cys20-Glu110, respectively. The
machinery is approximately 115 Å in length, 84 Å in width and 132 Å
in height (Fig. 1a).

The architecture of BamACDE resembles a top hat with an opening
in the crown. This crown is formed by the BamA β-barrel, with
the encircling POTRA domains and associated proteins forming the
brim. The C-terminal β-barrel of BamA projects out of the complex
and is fully immersed in the OM, while the five POTRA domains of
BamA and BamD form a ring in the periplasm (Fig. 1a-c). The
other subunits of the complex surround this central BamAD core.
The coiled N-terminal loop of BamC is bound to BamD, as is its
N-terminal globular domain, which also interacts with POTRA 1 of
BamA. The C-terminal globular domain of BamC interacts with BamD
and POTRA 2. Notably, for one of the two BamACDE complexes in
the asymmetric unit, no electron density was observed for the
N-terminal and C-terminal globular domains of BamC, which may
indicate inherent flexibility (Extended Data Fig. 2). This flexibility was
also observed in the molecular dynamics simulations. Finally, BamE
is found at the opposite end of the complex, coupling the C-terminal
domain of BamD to POTRA 4 and 5 of BamA, adjacent to the barrel
(Fig. 1b, c).

To obtain a structure of the complex with all five subunits, we
increased the expression level of BamB (Methods). The structure of
BamABCDE was determined by co-crystallization with sodium iodide,
using SAD and manual molecular replacement techniques, to a resolution
of 2.9 Å (Methods and Extended Data Table 1). The structure
we describe below is based on this BamABCDE complex unless otherwise
mentioned. The β-strands of the C-terminal barrel of BamA
are named β1C-β16C for consistency with previous reports. The
top hat architecture of BamABCDE is similar to that of BamACDE,
with dimensions of around 120 Å in length, 98 Å in width and 140 Å
in height, and with the periplasmic ring structure retained (Fig. 1d-f). In
the BamABCDE structure, the opening in the crown of the top hat is
now closed. This model of the BAM complex contains BamB (residues
Lys31-Thr391), which is shown to bind to POTRA 2 and 3. Although
SDS-PAGE analysis of the crystals showed that BamC is intact in the
BamABCDE crystals (Extended Data Fig. 1), electron density is only
visible for the N-terminal loop (residues Val35 to Pro88), bound to
BamD. This indicates that the rest of BamC is highly flexible. Molecular
dynamics simulations of the BamABCDE and BamACDE complexes
suggest that both complexes are otherwise stable and that the periplasmic
ring structure remains intact during the simulations (Extended Data
Fig. 3, Supplementary Video 1 and Supplementary Figs 2-4).


Inward- and lateral-open conformations

In the structure of the BamABCDE complex, the extracellular loops
(L-1 to L-8) cap the pore of BamA to completely close it to the extracellular
side, while the periplasmic mouth is fully open to the periplasm
(Fig. 2a, b). This conformation is similar to all reported barrel
structures of BamA; however, the POTRA domains are significantly
different (Extended Data Fig. 4). The POTRA domains of BamABCDE
appear locked through their interactions with BamD, which together
form a ring apparatus that may feed the unfolded OMP into the assembly
machinery. It is worth noting that β16C of both Neisseria gonorrhoeae
BamA and the BamABCDE complex coils towards the inside of
the barrel lumen, creating a gap between β1C and β15C of the barrels
(Figs. 1d, 2f and Extended Data Fig. 4), which may provide a path for
insertion of the OMPs.

Figure 1 | Structure of two complexes of E. coli
β-barrel assembly machinery. Two structures of
E. coli BAM: BamABCDE and BamACDE. The
BamA (red) C-terminal barrel is embedded in the
OM, while the N-terminal domain of BamA is in
the periplasm, forming a novel circular structure
with lipoproteins BamB (green), BamC (blue),
BamD (magenta) and BamE (cyan). a-c, Cartoon
representation of the structure of the BamACDE
complex, viewed from the membrane plane (a),
extracellular side (b) and periplasm (c). BamD
interacts with POTRA 1, 2 and 5 to form a ring
structure in the periplasm, while BamC binds to
both BamD and POTRA 1 and 2 of BamA. BamE
forms contacts with both BamA and BamD. The
dimensions of BamACDE were measured at the
widest points of the outer surfaces of the complex.
d-f, Cartoon representation of the BamABCDE
structure, viewed from the membrane plane
(d), extracellular side (e) and periplasm (f).
BamB interacts with POTRA 2 and 3, while only
the N-terminal loop of BamC forms contacts with
BamD. The dimensions of BamABCDE were
measured as in a.

By contrast, in the structure of BamACDE, extracellular loops L-1,
L-2 and L-3 are displaced from the pore, opening the barrel laterally
between β1C and β16C. This exposes the barrel lumen to both the
extracellular leaflet of the OM and the outside of the cell (Fig. 2c, d).
Conversely, on the periplasmic side, POTRA 5 and turns T-1 to T-4
completely plug the barrel (Fig. 2d and Extended Data Fig. 5). The
barrel of BamA in the BamACDE structure is therefore in a lateral-open
conformation. The first six β-strands, β1C to β6C, perform
a scissor-like movement to rotate away from the pore to a maximum
angle of around 65° and a distance of about 15 Å (Fig. 2e and Extended
Data Fig. 5). The other strands of the barrel remain unchanged. These
conformational changes open the barrel laterally to the OM and the
extracellular side, and, in conjunction with POTRA 5, close the periplasmic
mouth (Fig. 2c, d and Extended Data Fig. 5). Such a mechanism
of conformational changes between inward- and outward-open
conformations to transport small molecular substrates is common for
α-helical inner membrane protein transporters. However, to our
knowledge, this is the first reported crystal structure of a β-barrel
that may alternate between inward- and lateral-open conformations.
The novel architecture of the lateral-open conformation is likely
to facilitate the insertion of β-strands of the OMP into the OM, while
permitting the interlinking extracellular loops to extend out of the cell
upon insertion.

Disulfide bond cross-linking has suggested that lateral separation
between β1C and β16C is required for normal BamA function.
To test the two solved conformations, in vivo cross-links were designed 
to interlock BamA in one of the two conformational states. Two double 
cysteine mutations, Glu435Cys/Ser658Cys and Glu435Cys/Ser665Cys, 
were created to capture BamA in the inward-open conformation 


Figure 2 | Inward- and lateral-open conformations of BAM. BamA-BamE
are in the same colours as in Fig. 1. The functional assays and the
western blots were repeated at least three times. a, Membrane view of the
molecular surface of BamABCDE. The pore of BamA is completely sealed
at the extracellular side by the extracellular loops. b, Periplasmic view of
BamABCDE. The barrel is open to the periplasm (indicated by the arrow).
c, Membrane view of the molecular surface of BamACDE. The barrel is open
laterally to the OM and the extracellular side (indicated by the arrow).
d, Periplasmic view of BamACDE surface structure. The barrel is completely
closed to the periplasm. e, The important conformational changes of the
BamA barrel domain between the inward-open (red) and the lateral-open
(yellow) conformations. The barrel strands β1C-β6C of BamA have been
rotated about 65° with the distance around 15 Å to open the barrel laterally


(Fig. 2f), and one double mutation, Gly393Cys/Gly584Cys, was pro- 
duced to restrain BamA in the lateral-open conformation (Fig. 2g). 
The single cysteine mutations do not affect cell growth, while the dou- 
ble cysteine mutations Gly393Cys/Gly584Cys, Glu435Cys/Ser665Cys 
and Glu435Cys/Ser658Cys are all lethal (Fig. 2h, i). In addition, the 
double cysteine mutants are folded into the OM, and can be rescued 
by the addition of 2 mM reducing reagent Tris(2-carboxyethyl)phos- 
phine hydrochloride (TCEP) (Fig. 2j, k, Extended Data Fig. 5 and 
Supplementary Fig. 5), which breaks the disulfide bonds and therefore 
unlocks the structure, providing strong evidence that the barrel can 
exist in the two resolved conformations in the bacterial OM. 
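
The cross-link designs above are written in three-letter notation, whereas figure panels typically use the compact one-letter form. As a reading aid, the sketch below converts between the two and records which state each double cysteine mutant is expected to lock. The helper name to_short_form and the CROSS_LINKS table are illustrative assumptions and are not part of the study's methods.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# One substitution: wild-type residue, position, replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def to_short_form(mutant: str) -> str:
    """Convert e.g. 'Gly393Cys/Gly584Cys' to 'G393C/G584C'."""
    parts = []
    for sub in mutant.split("/"):
        match = _SUBSTITUTION.match(sub.strip())
        if match is None:
            raise ValueError(f"not a substitution: {sub!r}")
        wild_type, position, replacement = match.groups()
        parts.append(
            f"{THREE_TO_ONE[wild_type]}{position}{THREE_TO_ONE[replacement]}"
        )
    return "/".join(parts)


# Cross-link designs named in the text and the state each is expected to lock.
CROSS_LINKS = {
    "Glu435Cys/Ser658Cys": "inward-open",
    "Glu435Cys/Ser665Cys": "inward-open",
    "Gly393Cys/Gly584Cys": "lateral-open",
}

if __name__ == "__main__":
    for mutant, state in CROSS_LINKS.items():
        print(f"{to_short_form(mutant):<12} {state}")
```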


Essential interactions between BamA and BamD 

Previous mutagenesis analysis has suggested that only POTRA 5 of
BamA associates with BamD; however, before this study no structures
of this complex had been solved. In our structures, 12 residues
of BamD interact with 17 residues of POTRA 5 (Fig. 3a, Extended 
Data Fig. 6 and Supplementary Table 1). In addition to contacts with 
POTRA 5, our structures reveal that BamD also interacts with Val480 

from the inward-open state. f, The double mutation Glu435Cys/Ser665Cys
or Glu435Cys/Ser658Cys is expected to lock the barrel in the inward-open
conformation. Residues Ile806-Lys808 of the β16C of BamA coil towards
the inside of the barrel lumen. g, The Gly393Cys/Gly584Cys mutant
is expected to lock the barrel in the lateral-open conformation. h, The
functional assays of the mutants. The single residue mutations do not affect
the E. coli cell growth, but the double cysteine mutations kill the bacteria.
Numbers 1-10 are as shown in i. i, The protein expression levels of BamA
mutants in the OM were checked by western blotting. EV, empty vector
without BamA; WT, wild type. j, The reducing reagent TCEP could rescue
the double cysteine mutations at 2 mM. k, The protein expression levels of
BamA double cysteine mutants in the absence and presence of TCEP were
checked by western blotting.


and Asp481 of periplasmic turn T-2 of the BamA barrel and also forms 
contacts with POTRA 1 and 2. These interactions complete the ring 
structure (Extended Data Fig. 7 and Supplementary Table 1). 

Molecular dynamics simulations of only the core BamAD periplas- 
mic ring from both structures retain the cyclic complex. Removal of 
BamD markedly increases the dynamics of POTRA 1 and 2 in the 
BamACDE conformation (Extended Data Fig. 3 and Supplementary 
Video 1). In this instance, the POTRA domains rotate in an anticlock- 
wise direction towards the OM. This rotation also causes POTRA 3 
to separate from the T-5 and T-6 periplasmic turns of the BamA bar- 
rel (Extended Data Fig. 3 and Supplementary Video 1). By contrast, 
simulations of only BamA from the BamABCDE structure result in
POTRA 2 coupling to POTRA 5, thereby stabilizing POTRA 1 and 2. 
However, to achieve this configuration, POTRA 3 and 4 separate from 
the barrel, with a degree of deformation to POTRA 3, suggesting that 
this is a strained conformation. 

To test whether the BamA and BamD interactions are required for 
BAM function, BamA POTRA 5 mutants Glu373Lys and Arg366Glu 
were generated. Functional assays showed that Arg366Glu severely 
impairs cell growth, whereas Glu373Lys is lethal to the E. coli cells
(Extended Data Fig. 7). In the structures, BamA Glu373 and BamD
Arg197 form a salt-bridge (Fig. 3a). An Arg197Leu BamD mutation
was able to rescue BamA Glu373Lys.


BamB regulates BamA conformation 

The most apparent differences between the two solved structures are
that BamB is present in the BamABCDE complex, whereas BamC is more
clearly resolved in the BamACDE complex (Fig. 2a, c). In addition,
the POTRA domains of BamA are found in two distinct conformations,
with a larger separation observed between POTRA 3 and 5 in the
BamACDE complex. Speculatively, this could be due to the absence of
BamB, which binds to POTRA 2 and 3 in the BamABCDE complex and
is known to have a regulatory role. The overall interface between
BamA and BamB is around 1,080 Å², and comprises the three
β-strands of POTRA 3 and a loop consisting of residues Thr245-Lys251
that anchors to the centre of the BamB β-propeller (Fig. 3b, Extended
Data Fig. 7 and Supplementary Table 2). The BamB loops at the BamA
binding side adopt conformational changes to bind to POTRA 3
(refs 21, 22; Extended Data Fig. 7), consistent with the BamB structure
in complex with POTRA 3 and 4 (ref. 32). BamB also interacts with
residues Lys135 and Tyr147 of POTRA 2 (Fig. 3b). As a result, the
binding of BamB appears to induce local conformational changes in
POTRA 2 and 3 with a root mean square deviation (r.m.s.d.) of 3.57 Å
over 159 Cα atoms (Extended Data Fig. 7).
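
For readers unfamiliar with the r.m.s.d. figure quoted above, the following is a minimal sketch of how such a value is computed from paired Cα coordinates. It assumes the two structures have already been superposed (the superposition procedure is not described in this section), and the toy coordinates are hypothetical, not taken from the BAM structures.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root mean square deviation between paired, already-superposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the two positions of the same atom.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy coordinates in Å (hypothetical): every atom displaced by 3 Å along x.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 3.0, y, z) for x, y, z in reference]

if __name__ == "__main__":
    print(f"r.m.s.d. = {rmsd(reference, shifted):.2f} Å over {len(reference)} atoms")
```

A uniform 3 Å displacement gives an r.m.s.d. of 3 Å, so the 3.57 Å reported over 159 Cα atoms corresponds to an average displacement of that order after superposition.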

Both periplasmic turns T-5 and T-6 of the BamA barrel are more 
ordered in the BamABCDE structure and interact with POTRA 3 
(Extended Data Fig. 6). In the BamACDE complex, POTRA 3 separates 
from the periplasmic turns (Fig. 2c), indicating that BamB may have a 
role in controlling the structural rearrangements of the barrel through 
POTRA 3. It is worth noting that POTRA 5 also has extensive contacts 
with the periplasmic turns T-1, T-2 and T-3 of the barrel domain in 
the BamABCDE structure (Extended Data Fig. 6), and we speculate 
that the conformational changes of POTRA domains may play a role 
in controlling the conformations of the barrel. 

In the molecular dynamics simulations, the tight coupling between 
POTRA 3 and BamB was retained. However, in the absence of 
BamB, the simulations reveal greater dynamics of POTRA 3 and 4, 
with both domains moving away from the barrel and the membrane 
(Supplementary Video 1). This suggests that BamB is important for 
coupling of POTRA 3 at the appropriate height with respect to the 
barrel and OM. 


BamE and BamC interactions with BamA and BamD 

Previous studies suggested that BamE binds directly only to BamD.
Surprisingly, both BamABCDE and BamACDE structures show that 
BamE is not only positioned between BamA and BamD, but also 
forms contacts with BamC (Fig. 4a, Extended Data Figs 6 and 8 and 

Figure 3 | BamA interacts with BamD and
BamB in BamABCDE complex. BamD contacts
POTRA domains 1, 2 and 5 to form a ring
structure. a, BamA POTRA 5 interacts with the
C-terminal domain of BamD. BamA residues
Arg366 and Glu373 and BamD residue Arg197
are important for the two protein interactions,
and their carbon atoms are coloured in yellow.
b, BamA and BamB interaction. Both POTRA 2
and 3 are involved in the BamB interaction. BamA residues
Val245, Tyr255 and BamB residues Leu192,
Leu194 and Arg195 have important roles in BamA
and BamB interactions.


Supplementary Tables 3 and 4). Notably, the BamE residues Pro67 and 
Phe68 also interact with BamC residues Met56 and Ile57 (Extended 
Data Fig. 8), suggesting that BamC, BamD and BamE form a network 
to regulate the conformations of BamA. 

BamE residues Ile32, Gln34, Leu63 and Arg78 interact with both
BamA and BamD. Mutations to any of these residues caused defects
to the OM. Additionally, single BamE mutations of Asn36 and Tyr37
at the BamE and BamA interface or of residues Met64, Asp66, Phe74,
Val76 and Gln88 at the BamE and BamD interface caused defects of




Figure 4 | BamE and BamC interact with BamA and BamD. a, The
interface between BamE and BamA in the BamABCDE complex. BamE
forms contacts with POTRA 5 residues, BamA periplasmic turns T-2 and
T-3, and POTRA 4 in the BamACDE complex (Extended Data Fig. 6).
b, The C-terminal globular domain of BamC interacts with BamA POTRA
2 at the β-sheets in BamACDE. Residues in the two β-sheets that are
involved in the BamC and BamA interactions are shown.

Figure 5 | The function of BamA POTRA 5. a, The three β-strands of
BamA POTRA 5. The residues selected for functional assays are shown:
residues Lys351 and Arg353 on β1, Thr397, Asp399 and Asp401 on β2 and
Val415, Lys417 and Lys419 on β3. b, The β-strands of POTRA 5 are crucial
for bacterial survival. The functional assays and the western blots were
repeated at least three times. The single mutant Lys351Pro, the double
mutant Lys351Pro/Arg353Pro at β1, the double mutants Val415Pro/Lys417Pro,
Lys417Pro/Lys419Pro and Val415Pro/Lys419Pro at β3 kill the
bacteria, while the triple mutant Thr397Pro/Asp399Pro/Asp401Pro at β2
impairs cell growth. The protein expression of the BamA wild-type and
mutants. Numbers 1-12 are as shown below.


the OM. These data suggest that BamE has an important role in
OMP assembly. 

The whole BamC structure is revealed in the BamACDE complex. 
BamC forms extensive contacts with BamD, with an average interface
of 2,686 Å² (Fig. 4b, Extended Data Fig. 8 and Supplementary
Table 5). The N-terminal loop of BamC, up to residue Gly94, is largely 
unstructured, coiling round BamD and forming a cluster of contacts 
with BamE. The N-terminal globular domain of BamC interacts with 
the N-terminal domain of BamD and POTRA 1 of BamA (Extended 
Data Fig. 8). The C-terminal globular domain interacts principally with 
POTRA 2, via their β-sheets (Fig. 4b).

The C-terminal globular domain of BamC binds to POTRA 2 in one 
of the BamACDE complexes. This probably enhances the periplasmic 
ring structure formed by BamA and BamD, and has a role in the con- 
formational changes of BAM. Comparison of the two BamACDE com- 
plexes in the asymmetric unit reveals differences in the barrel domain, 
with an r.m.s.d. of 0.91 Å over 378 Cα atoms, while the periplasmic ring
is somewhat rotated, with respect to the barrel (Extended Data Fig. 2). 
SDS-PAGE analysis confirmed that both BamACDE and BamABCDE 
crystals contain full-length BamC (Extended Data Fig. 1), suggest- 
ing that the two globular domains of BamC are dynamic. Molecular 
dynamics simulations, with the addition of the BamC globular domains 
to this structure, also show that these domains are not tightly coupled 
to the complex. Indeed, in the simulations of both complexes, BamC 
shows the least stability of all five subunits. The total interface between
BamC and BamA is around 794 Å² in the BamACDE structure
(Fig. 4b, Extended Data Fig. 8 and Supplementary Table 6). The molecular
dynamics simulations of the BamACDE complex without BamC suggest
that POTRA 1 moves towards the membrane, while POTRA
3 moves towards the barrel and engages with the periplasmic turns T-5 
and T-6 (Extended Data Fig. 3). In the simulations of the inward-open 
BamABCDE complex the absence of BamC globular domains has lim- 
ited effects on the overall structure, as BamD is more tightly coupled to 
the POTRA domains. Taken together, all four lipoproteins BamB, C, D 
and E have direct contacts with BamA POTRA domains, which may be 
important in terms of conformation and functional regulation. Analysis 
of the BAM subunits reveals that conserved residues are mapped to 
those regions involved in inter-subunit protein-protein interactions 
(Extended Data Fig. 9 and Supplementary Fig. 6). 

All the POTRA domains of BamA have the βααββ fold. An NMR
study suggested that the β-sheet of the POTRA domains may bind
substrate in a non-specific manner. Our structural studies show
that both BamB and the C-terminal domain of BamC bind to the
POTRA domains 2 and 3 through their β-sheets. The three β-strands
of POTRA 5 adopt an important conformational change, from 126°
between the β1C and the POTRA 5 β-sheet (β3) in the inward-open
state to 165° in the lateral-open state, aligning the β-sheet of POTRA
5 with β1C. Furthermore, almost all BamA of Gram-negative bacteria,
SAM50 of mitochondria and OEP80 of chloroplasts have the last
POTRA domain, indicating that the POTRA 5 of BamA may play an
important role in the insertion of OMPs. To test this possibility, single
proline substitutions (Lys351Pro, Arg353Pro, Thr397Pro, Asp399Pro,
Asp401Pro, Val415Pro, Lys417Pro, Lys419Pro), double proline
substitutions (Lys351Pro/Arg353Pro, Thr397Pro/Asp399Pro,
Asp399Pro/Asp401Pro, Thr397Pro/Asp401Pro, Val415Pro/Lys417Pro,
Lys417Pro/Lys419Pro, Val415Pro/Lys419Pro) and a triple proline substitution
(Thr397Pro/Asp399Pro/Asp401Pro) were generated in β1, β2 and β3
of BamA POTRA 5. Functional assays showed that all single mutants
(except Lys351Pro) and double mutants at β2 (Thr397Pro/Asp399Pro,
Asp399Pro/Asp401Pro, Thr397Pro/Asp401Pro) did not affect E. coli
cell growth, but the single substitution Lys351Pro at β1 and the double
proline substitutions at β1 (Lys351Pro/Arg353Pro) and β3
(Val415Pro/Lys417Pro, Lys417Pro/Lys419Pro or Val415Pro/Lys419Pro)
are lethal, and the triple mutation at β2 (Thr397Pro/Asp399Pro/Asp401Pro)
impaired cell growth (Fig. 5a, b). This strongly suggests
that the β-sheet, especially β-strands 1 and 3 of POTRA 5, may
have a critical role in OMP insertion, possibly by β-augmentation of
the unfolded OMPs.
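
The proline scan above covers sixteen constructs, and the pattern across strands is easier to see when tallied. The sketch below encodes the phenotypes exactly as reported in the preceding paragraph and counts them per strand. The phenotype labels, the dictionary layout and the function name are illustrative choices, not the authors' own data format.

```python
from collections import Counter

# Phenotypes as reported in the text for proline substitutions in POTRA 5.
# Strands: β1 carries 351/353, β2 carries 397/399/401, β3 carries 415/417/419.
PROLINE_SCAN = {
    # single substitutions
    "Lys351Pro": ("β1", "lethal"),
    "Arg353Pro": ("β1", "no effect"),
    "Thr397Pro": ("β2", "no effect"),
    "Asp399Pro": ("β2", "no effect"),
    "Asp401Pro": ("β2", "no effect"),
    "Val415Pro": ("β3", "no effect"),
    "Lys417Pro": ("β3", "no effect"),
    "Lys419Pro": ("β3", "no effect"),
    # double substitutions
    "Lys351Pro/Arg353Pro": ("β1", "lethal"),
    "Thr397Pro/Asp399Pro": ("β2", "no effect"),
    "Asp399Pro/Asp401Pro": ("β2", "no effect"),
    "Thr397Pro/Asp401Pro": ("β2", "no effect"),
    "Val415Pro/Lys417Pro": ("β3", "lethal"),
    "Lys417Pro/Lys419Pro": ("β3", "lethal"),
    "Val415Pro/Lys419Pro": ("β3", "lethal"),
    # triple substitution
    "Thr397Pro/Asp399Pro/Asp401Pro": ("β2", "impaired growth"),
}


def tally_by_strand(scan):
    """Count phenotypes per strand: {strand: Counter({phenotype: n})}."""
    summary = {}
    for strand, phenotype in scan.values():
        summary.setdefault(strand, Counter())[phenotype] += 1
    return summary


if __name__ == "__main__":
    for strand, counts in sorted(tally_by_strand(PROLINE_SCAN).items()):
        print(strand, dict(counts))
```

In this tally every lethal construct maps to β1 or β3, while β2 tolerates single and double substitutions and shows a growth defect only for the triple mutant, which is the basis for the emphasis on strands 1 and 3.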


Mechanism and conclusion 

In Gram-negative bacteria, outer membrane barrel proteins are inserted
and assembled into the OM by the BAM complex. Our studies have
revealed the three-dimensional architecture of the entire E. coli BAM
complex, trapped in two distinct conformational states. The structures
suggest that a rotation of the periplasmic ring (Extended Data
Figs 2, 5 and Supplementary Videos 1, 2) and conformational changes
of the POTRA domains and BamB-BamE (Extended Data Fig. 10)
induce the considerable conformational changes to the barrel of BamA
required for BAM-induced OMP insertion (Fig. 2e). Considering that
all four lipoprotein subunits, BamB-BamE, directly interact with
POTRA domains, the ring architecture of the E. coli BAM complex
may be an efficient way to coordinate all BAM subunits and thereby
promote OMP insertion into the OM (Extended Data Fig. 10 and
Supplementary Video 2).

To accomplish insertion, the OM periplasmic lipid head groups must
be circumnavigated by the unfolded or partially folded OMPs. Several
mechanisms for OMP insertion have previously been described,
with the ‘BamA-assisted’ model and the ‘budding’ model currently the
two most favoured.

Our structures reveal a 30° rotation of the periplasmic ring complex,
which interacts directly with the lipid headgroups of the OM (Extended
Data Figs 5, 10). This rotation is probably coupled to the 65° tilting
of strands β1C-β6C of the BamA barrel and the partial separation of
the lateral gate, formed by β1C and β16C (Fig. 2c, e). This exposes the
barrel lumen to the core of the OM, while also inducing a degree of
membrane instability to facilitate OMP insertion. The BamA homologue,
SAM50, in mitochondria will probably use a similar scissor-like
movement of the barrel strands to promote OMP insertion into the
mitochondrial OM; however, this is performed in the absence of the
periplasmic ring.

In summary, our structural and functional studies and molecular dynamics
simulations have revealed that the BAM complex has a unique ring
architecture and is able to adopt both inward-open and lateral-open states.
We propose that these structures represent the resting (BamABCDE)
and post-insertion (BamACDE) states of the complex. These findings
shed important light on how the BAM subunits work together to
insert unfolded OMPs into the OM without using ATP, and set up
an important platform for further studies of OM biogenesis and the
potential development of novel therapies, for example by inhibiting
complex formation.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Cloning, expression and purification of BAM complex. Expression plasmid
pJH114, containing the five E. coli bamABCDE genes under the control
of a trc promoter and with an octa-histidine (8 × His) tag at the C terminus
of bamE, was initially used for overexpression of the BamABCDE complex in
E. coli HDB150 cells. Expression of the native BamABCDE complex was induced
with 100 μmol l⁻¹ isopropyl-β-D-1-thiogalactopyranoside (IPTG; Formedium)
at 20°C overnight when the absorbance of the cell culture at 600 nm reached
0.5-0.8. The selenomethionine-labelled BAM complexes were expressed in M9
medium supplemented with selenomethionine Medium Nutrient Mix (Molecular
Dimensions) and 100 mg l⁻¹ L-(+)-selenomethionine (Generon) under conditions
similar to those used for the native BamABCDE.

Both native and selenomethionine-labelled BamABCDE complexes were purified
using a similar protocol. In brief, the cells were pelleted and resuspended in
lysis buffer containing 20 mM Tris-HCl, pH 8.0, 150 mM NaCl, 10 μg ml⁻¹ DNase
I and 100 μg ml⁻¹ lysozyme and lysed by passing through a cell disruptor
(Constant Systems) at 206 MPa. The lysate was centrifuged to remove the cell
debris and unbroken cells, and the supernatant was ultracentrifuged at
100,000g for 1 h to pellet the membranes. The cell membranes were resuspended
in solubilization buffer containing 20 mM Tris-HCl, pH 8.0, 300 mM NaCl, 10 mM
imidazole and 1-2% n-dodecyl-β-D-maltopyranoside (DDM; all detergents were
purchased from Anatrace) and rocked for 1 h at room temperature or overnight
at 4°C. The suspension was ultracentrifuged and the supernatant was applied to
a 5-ml pre-equilibrated HisTrap HP column (GE Healthcare). The column was
washed with wash buffer containing 20 mM Tris-HCl, pH 8.0, 300 mM NaCl and
35 mM imidazole and eluted with elution buffer containing 300 mM imidazole.
The eluent was applied to a HiLoad 16/600 Superdex 200 prep grade column (GE
Healthcare) pre-equilibrated with gel filtration buffer containing 20 mM Tris-HCl,
pH 7.8, 300 mM NaCl and detergents. Different detergents were used in protein
purification procedures.

The purified BamABCDE complex was analysed by SDS-PAGE (Extended
Data Fig. 1 and Supplementary Fig. 1), which indicated that BamB was
present in insufficient amounts in the complex, and BamB is absent from the
determined structure. We therefore decided to generate a new plasmid to
express the BamABCDE complex. An additional copy of the E. coli bamB gene
was introduced into pJH114 (ref. 16) after the 8 × His tag to generate a new
expression plasmid, pYG120, using a modified sequence and
ligation-independent cloning (SLIC) method. In
brief, vector backbone and bamB gene fragments were amplified by PCR using 
Q5 Hot Start High-Fidelity DNA Polymerase (New England BioLabs), and plas- 
mid pJH114 as template and primers PF_pJH114_SLIC (5’-GTTAATCGACC 
TGCAGGCATGCAAG-3’) and PR_pJH114_SLIC (5’-CTCTAGAGGATCTTAG 
TGGTGATGATGGTG-3’), and PF_EBB_SLIC (5’-TCATCACCACTAAGATCCT
CTAGAGAGGGACCCGATGCAATTGC-3’) and PR_EBB_SLIC (5’-CTT 
GCATGCCTGCAGGTCGATTAACGTGTAATAGAGTACACGGTTCC-3’), 
respectively. Gel-extracted fragments were digested by T4 DNA polymerase 
(Fermentas) at 22°C for 35 min followed by 70°C for 10 min, and then placed on 
ice immediately. The digested fragments were annealed in an annealing buffer 
(10 mM Tris, pH 8.0, 100 mM NaCl and 1mM EDTA) by incubating at 75°C 
for 10 min and decreasing by 0.1°C every 8 s to 20°C. The mixture was
transformed into E. coli DH5α for plasmid preparation. The DNA sequences were
confirmed by sequencing. For the purification of the BamABCDE complex from 
the pYG120 construct, the wash buffer, elution buffer and gel filtration buffer 
were supplemented with different detergent combinations. A second gel filtration 
was performed to change detergents with gel filtration buffer containing 1 CMC 
n-octyl-β-D-glucopyranoside (OG) and 1 CMC N-dodecyl-N,N-dimethylamine-N-oxide
(LDAO). For BamABCDE complex purification from construct pJH114,
the wash buffer, elution buffer and gel filtration buffer were supplemented with 2
CMC n-nonyl-β-D-glucoside (β-NG) and 1 CMC tetraethylene glycol monooctyl
ether (C8E4). The peak fraction was pooled and concentrated using Vivaspin
20 centrifugal concentrator (Sartorius, molecular mass cut off: 100 kDa). The 
selenomethionine-labelled proteins were purified in the same way as the native 
proteins of BamABCDE complex. 
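
The SLIC annealing step above is specified as a hold followed by a stepped cooling ramp, and its total duration follows from simple arithmetic. The sketch below works it out, assuming one 8 s hold per 0.1°C decrement (the thermocycler program itself is not given in the text). The function name is hypothetical.

```python
def ramp_seconds(start_c: float, end_c: float, step_c: float, hold_s: float) -> float:
    """Time to step from start_c down to end_c in step_c decrements, holding hold_s per step."""
    if step_c <= 0 or start_c < end_c:
        raise ValueError("expect a positive step and a descending ramp")
    # round() guards against floating-point error in the division.
    steps = round((start_c - end_c) / step_c)
    return steps * hold_s


initial_hold_s = 10 * 60                      # 10 min at 75 °C
cooling_s = ramp_seconds(75.0, 20.0, 0.1, 8)  # 0.1 °C every 8 s down to 20 °C
total_min = (initial_hold_s + cooling_s) / 60

if __name__ == "__main__":
    print(f"ramp: {cooling_s / 60:.1f} min, total: {total_min:.1f} min")
```

Under this reading the ramp comprises 550 steps, so cooling takes a little over 73 min and the whole annealing step a little over 83 min.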

Crystallization, data collection and structure determination. The purified
proteins were concentrated to 8-12 mg ml⁻¹ for crystallization. For NaI
co-crystallization, NaCl was replaced by NaI in the gel filtration buffer. All
crystallizations were carried out by the sitting-drop vapour diffusion method in
MRC 96-well crystallization plates (Molecular Dimensions) at 22°C. The protein
solution was mixed in a 1:1 ratio with the reservoir solution using the Gryphon
crystallization robot (Art Robbins Instruments). The best NaI co-crystallized
crystals were grown from 150 mM HEPES, pH 7.5, 30% PEG6000 and CYMAL-4 in
MemAdvantage (Molecular Dimensions) as additive. The best native crystals were
grown from 150 mM HEPES, pH 7.5 and 27.5% PEG6000. The best
selenomethionine-labelled crystals were grown from 100 mM Tris, pH 8.0, 200 mM
MgCl₂·6H₂O, 24% PEG1000 MME and OGNG in MemAdvantage as additive. The
crystals were harvested, flash-cooled and stored in liquid nitrogen for data
collection. The data sets of the selenomethionine-labelled BAM complex were
collected on the I03 beamline at Diamond Light Source (DLS) at a wavelength of
0.9795 Å. All data were indexed, integrated and scaled using XDS. The crystals
belong to space group P42₁2, with cell dimensions a = b = 254.16 Å,
c = 179.22 Å, α = β = γ = 90°. There are two complexes in the asymmetric unit.
The structure was determined to 3.9 Å resolution (Extended Data Table 1) using
ShelxD. Fifty-six selenium sites were found, which gave a figure of merit (FOM)
of 0.32. After density modification using DM, the BamACDE complex was clearly
visible in the electron density map, but without BamB. Using the individual
high-resolution models, the BamACDE complex was built using Coot by
skeletonizing the electron density map and docking the BAM subunits in the
electron density map with selenomethionine sites used as guides. Rigid body
refinement was performed following manual docking. NCS refinement was used
along with TLS refinement against groups automatically determined using
PHENIX. Restrained refinement was performed with group B-factors alongside
reference model secondary structure restraints from higher resolution models.
Weights were automatically optimised by PHENIX.

To obtain the BamABCDE complex structure, the new construct was used
to produce sufficient BamB to form the BamABCDE complex. The data sets
of the BamABCDE complex were collected on the I02 beamline at DLS. The
crystals belong to space group P4₁2₁2, with cell dimensions a = b = 116.69 Å,
c = 435.19 Å, α = β = γ = 90°. There is one complex molecule in the asymmetric
unit. Although the crystals diffracted to 2.90 Å, the crystal structure of BamABCDE
could not be determined by molecular replacement. The BamABCDE complex was
therefore crystallized in the presence of 0.2 M sodium iodide, and SAD data sets
were collected at a wavelength of 1.8233 Å. Four 360° data sets were collected
on different regions of the same NaI co-crystallized crystal and then combined.
The phases were determined by ShelxD at 4 Å resolution. Eleven iodide sites
were found, which gave a FOM of 0.28. The phases were extended to 2.90 Å by
DM, and the model was built using Coot by skeletonizing the electron density
map and docking the individual high-resolution subunits into it. This model was
then fitted as a rigid body into the higher-resolution native data set, retaining
and extending the free R set from the iodide data set. The BamABCDE complex
was refined using PHENIX. TLS groups were automatically determined using
PHENIX and used for refinement along with individual B-factors. Weights were
automatically optimised and secondary structure restraints were used.
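
The two data-collection wavelengths quoted above can be related to photon energies with E = hc/λ, which helps explain the choices: the selenium K absorption edge lies near 12.66 keV, while a longer wavelength increases the anomalous signal from iodide. The sketch below performs the conversion. The constant is the standard value of hc in keV·Å, and the function name is hypothetical.

```python
# Photon energy from X-ray wavelength: E = h*c / lambda.
# With E in keV and lambda in Å, h*c is approximately 12.39842 keV·Å.
HC_KEV_ANGSTROM = 12.39842


def wavelength_to_kev(wavelength_angstrom: float) -> float:
    """Convert an X-ray wavelength in Å to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    for label, wavelength in [
        ("selenomethionine data", 0.9795),
        ("iodide SAD data", 1.8233),
    ]:
        print(f"{label}: {wavelength} Å = {wavelength_to_kev(wavelength):.2f} keV")
```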

Site-directed mutagenesis and functional assays. An E. coli bamA expression
plasmid was constructed for functional assays using the SLIC method as
described above. An N-terminal 10 × His tag fused with bamA starting from
residue 22 was amplified by PCR using Q5 Hot Start High-Fidelity DNA 
Polymerase (New England BioLabs), and plasmid pJH114 as template and 
primers PF_bamA_SLIC (5’-CCATCATCATCATCATCATCATCATGAAG
GGTTCGTAGTGAAAGATATTCATTTCGAAG-3’) and PR_bamA_SLIC
(5’-AGACTCGAGTTACCAGGTTTTACCGATGTTAAACTGGAAC-3’). Vector
backbone was amplified from a modified pRSFDuet-1 vector (Novagen, Merck
Millipore) containing an N-terminal pelB signal peptide coding sequence
with primers PF_RSFM_SLIC (5’-CGGTAAAACCTGGTAACTCGAGTCT
GGTAAAGAAACCGCTGC-3’) and PR_RSFM_SLIC (5’-ATGATGATGAT 
GATGATGATGATGGTGATGGGCCATCGCCGGCTG-3’). Plasmids were pre- 
pared using GeneJET Plasmid Miniprep Kit (Thermo Scientific). Site-directed 
mutagenesis was performed according to a previously described protocol with
slight modification (PCR conditions and the sequences of the primers are
available on request). The sequences of the wild type and all mutant constructs of
BamA were confirmed by sequencing. E. coli JCM166 cells transformed with
the wild-type BamA or its mutants were plated on LB agar plates supplemented
with 50 μg ml⁻¹ kanamycin and 100 μg ml⁻¹ carbenicillin in the presence or
absence of 0.05% L-(+)-arabinose and grown overnight at 37°C. Single colonies
grown on arabinose-containing plates were inoculated in 10 ml LB medium
supplemented with 50 μg ml⁻¹ kanamycin, 100 μg ml⁻¹ carbenicillin and 0.025%
L-(+)-arabinose, and incubated at 200 r.p.m. at 37°C for 16 h. For plate assays,
the cells were pelleted and resuspended in fresh LB medium supplemented with
50 μg ml⁻¹ kanamycin and 100 μg ml⁻¹ carbenicillin, diluted to an A600 nm of
~0.3, streaked onto LB agar plates supplemented with 50 μg ml⁻¹ kanamycin and
100 μg ml⁻¹ carbenicillin in the presence or absence of 0.05% L-(+)-arabinose and
cultured at 37°C for 12-14 h.

Western blot. Western blotting was performed to examine protein expression 
levels of BamA in the membrane. 50 ml of overnight cultures of transformed 
JCM166 cells with respective wild-type or each mutant of BamA were pelleted. 
The cells were resuspended in 25 ml 20 mM Tris-HCl, pH 8.0, 150 mM NaCl and 
sonicated. The cell debris and unbroken cells were removed by centrifugation at 
7,000g for 30 min. The supernatant was centrifuged at 100,000g for 60 min and 
the membrane fraction was collected. The membrane fraction was suspended 
in 5 ml buffer containing 20 mM Tris-HCl, pH 8.0, 150 mM NaCl and 1%
3-(N,N-dimethylmyristylammonio)-propanesulfonate (Sigma) and solubilized for 30
min at room temperature. Samples were mixed with 5 × SDS-PAGE loading buffer,
heated for 5 min at 90°C, cooled for 2 min on ice and centrifuged. Ten microlitres 
of each sample was loaded onto 4-20% Mini-PROTEAN TGX Gel (Bio-Rad) for 
SDS-PAGE and then subjected to immunoblot analysis. 

The proteins were transferred to PVDF membrane using Trans-Blot Turbo 
Transfer Starter System (Bio-Rad) according to the manufacturer's instructions. 
The PVDF membranes were blocked in 10 ml protein-free T20 (TBS) blocking 
buffer (Fisher) overnight at 4°C. The membranes were incubated with 10 ml
His-Tag monoclonal antibody (diluted, 1:1,000) (Millipore) for 1 h at room
temperature, then washed with PBST four times and incubated with IRDye 800CW goat
anti-mouse IgG (diluted, 1:5,000) (LI-COR) for 1h. The membrane was washed 
with PBST four times and PBS twice. Images were acquired using LI-COR Odyssey 
(LI-COR). 
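
As a worked example of the primary antibody dilution above, the sketch below computes the stock volume required. It assumes that 1:1,000 means one part stock in 1,000 parts of final solution, a convention the text does not state, and the function name is hypothetical.

```python
def stock_volume_ul(final_volume_ml: float, dilution_factor: float) -> float:
    """Stock volume in µl for a 1:dilution_factor dilution.

    Assumes '1:N' means one part stock in N parts of final solution.
    """
    if final_volume_ml <= 0 or dilution_factor < 1:
        raise ValueError("need a positive volume and a dilution factor >= 1")
    return final_volume_ml * 1000.0 / dilution_factor


if __name__ == "__main__":
    # 10 ml of primary antibody solution at 1:1,000, as stated in the text.
    print(f"{stock_volume_ul(10, 1000):.1f} µl of antibody stock in 10 ml")
```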

BamA heat-modifiability assays. The JCM166 cells containing the double cysteine 
mutants Gly393Cys/Gly584Cys, Glu435Cys/Ser665Cys and Glu435Cys/Ser658Cys 
of BamA were cultured overnight in LB medium with 50 μg ml⁻¹ kanamycin,
100 μg ml⁻¹ carbenicillin and 0.025% L-(+)-arabinose, respectively. The membrane
fraction from 50 ml cells was isolated and solubilized as described above.
The samples were mixed with SDS loading buffer and then boiled for 5 min or kept 
at room temperature for 5-10 min. SDS-PAGE was performed at 4°C by running 
the gel for 60 min at 150 V. The proteins were transferred to PVDF membrane as 
described above and the BamA mutants were detected by western blotting. 

Molecular modelling and simulations. All molecular dynamics simulations were
performed using GROMACS v5.0.2 (ref. 50). The Martini 2.2 force field was
used to run an initial 1 μs coarse-grained (CG) molecular dynamics simulation
to permit the assembly and equilibration of a 1-palmitoyl, 2-cis-vaccenyl,
phosphatidylglycerol (PVPG): 1-palmitoyl, 2-cis-vaccenyl, phosphatidylethanolamine
(PVPE) bilayer around the BamABCDE complexes. Using the self-assembled
system as a guide, the coordinates of the BAM complexes were inserted into an
asymmetric model E. coli OM, comprising PVPE, PVPG and cardiolipin in the
periplasmic leaflet and the inner core of Rd1 LPS lipids in the outer leaflet, using
Alchembed. This equated to a total system size of ~500,000 atoms. The systems
were then equilibrated for 1 ns with the protein restrained before 100 ns of
unrestrained atomistic molecular dynamics using the Gromos53a6 force field.
The lipid-modified cysteine parameters were created from lipid parameters for
diacylglycerol and palmitoyl and appended to the parameters of the N-terminal
cysteines. Systems were neutralised with Mg²⁺ ions, to preserve the integrity of
the outer leaflet of the OM, and a 150 mM concentration of NaCl.

All ~500,000-atom systems were run for 100 ns, with box dimensions in the
region of 200 × 200 × 150 Å³. To assess the stability of the subunit stoichiometry
we simulated various combinations of BAM assemblies. For both BamACDE and
BamABCDE crystal structures, we investigated ABCDE, AD and A alone, with
three repeats each; while single simulations were also performed for BamABD,
ACD, ADE, ABDE and ACDE, with a total simulation time equating to 2.8 µs. In
cases where domains or subunits were missing these were added to the complex by 
structurally aligning the resolved units from the companion structure. For BamB, 
this was added to the BamACDE complex by structurally aligning POTRA 3. 
For the full BamC, this was added to the BamABCDE by aligning the resolved 
N-terminal domains. Individual protein complexes were configured and built using 
Modeller and PyMOL (The PyMOL Molecular Graphics System, version 1.8,
Schrödinger, LLC). All simulations were performed at 37 °C, with protein, lipids
and solvent separately coupled to an external bath, using the velocity-rescale
thermostat. Pressure was maintained at 1 bar, with a semi-isotropic compressibility
of 4 × 10⁻⁵ using the Parrinello-Rahman barostat. All bonds were constrained
with the LINCS algorithm. Electrostatics were calculated using the Particle Mesh
Ewald (PME) method, while a cut-off was used for Lennard-Jones parameters, 
with a Verlet cut-off scheme to permit GPU calculation of non-bonded contacts. 
Simulations were performed with an integration time-step of 2 fs. 
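
The quoted total of 2.8 µs can be checked from the simulation inventory given above. The following is a minimal sketch, assuming that every run lasted 100 ns and that the five single simulations were performed for each of the two crystal structures; that reading is an interpretation of the text, and the variable names are illustrative.

```python
# Simulation inventory as described in the Methods text.
# Assumption: each run is 100 ns and the single runs exist for both structures.
RUN_LENGTH_NS = 100
CRYSTAL_STRUCTURES = ["BamACDE", "BamABCDE"]
REPEATED_ASSEMBLIES = {"ABCDE": 3, "AD": 3, "A": 3}  # assembly -> number of repeats
SINGLE_ASSEMBLIES = ["ABD", "ACD", "ADE", "ABDE", "ACDE"]


def total_runs() -> int:
    """Number of independent simulations across both crystal structures."""
    per_structure = sum(REPEATED_ASSEMBLIES.values()) + len(SINGLE_ASSEMBLIES)
    return per_structure * len(CRYSTAL_STRUCTURES)


def total_time_us() -> float:
    """Aggregate simulation time in microseconds."""
    return total_runs() * RUN_LENGTH_NS / 1000.0


print(total_runs(), total_time_us())
```

Under these assumptions the count is 2 × (9 + 5) = 28 runs, which reproduces the stated 2.8 µs.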
The linear interpolation between the three structures was performed using the
morph operation in Gromacs tools. Analysis of the molecular simulations was
performed using Gromacs tools, MDAnalysis and locally written scripts.

Conservation analysis was performed using Consurf. For each subunit, 150
homologues were collected from UNIREF90 using three iterations of CSI-Blast,
with an E-value of 0.0001. The Consurf scores were then mapped into the B-factor
column for each of the subunits. 
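
Mapping per-residue scores into the B-factor column can be sketched as below. This is not the authors' script: it is a minimal illustration that assumes fixed-column PDB records (chain identifier in column 22, residue number in columns 23-26, temperature factor in columns 61-66) and a hypothetical `scores` dictionary keyed by chain and residue number.

```python
def map_scores_to_bfactor(pdb_lines, scores):
    """Return PDB lines with the B-factor field (columns 61-66) overwritten.

    scores maps (chain_id, residue_number) -> conservation score.
    Atoms without a score, and non-atom records, are left unchanged.
    """
    out = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")) and len(line) >= 66:
            chain_id = line[21]           # column 22
            res_num = int(line[22:26])    # columns 23-26
            score = scores.get((chain_id, res_num))
            if score is not None:
                line = line[:60] + f"{score:6.2f}" + line[66:]
        out.append(line)
    return out


# Illustrative atom record (coordinates are made up).
demo = ("ATOM  " + "%5d" % 1 + "  CA  TRP A" + "%4d" % 576 + "    "
        + "%8.3f%8.3f%8.3f" % (11.0, 22.0, 33.0) + "%6.2f%6.2f" % (1.0, 50.0))
print(map_scores_to_bfactor([demo], {("A", 576): 9})[0])
```

A structure rewritten this way can then be coloured by B-factor in a molecular viewer to display the conservation scale.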
Extended Data Figure 1 | BamABCDE and BamACDE complexes and
electron density maps contoured at 1σ. a, Schematic diagram of the five
BAM subunits. P-1 to P-5 represent the five BamA POTRA domains.
b, SDS-PAGE analysis of the BAM complex from crystals. M, molecular
mass marker; 1 and 2 are crystals of the purified BAM complex expressed
by construct pYG120 and pJH114, respectively (Supplementary Fig. 1).
The BamABCDE crystals contain the full-length BamA–BamE. The
crystals were washed five times in fresh reservoir solution, and then
dissolved in SDS-PAGE loading buffer. The results showed that BamB
is absent in the BamACDE crystals, while BamC is complete in both
the BamABCDE and BamACDE crystals. c, SDS-PAGE analysis of the
purified BAM complex. The BAM complex expressed from pJH114 is a
mixture of BamABCDE and BamACDE complexes (Supplementary Fig. 1).
d, 2Fo − Fc electron density map of BamA residues Trp576–Lys580 of
BamACDE contoured at 1σ. e, 2Fo − Fc electron density map of BamD
residues Tyr177–Trp191 of BamACDE contoured at 1σ. f, 2Fo − Fc electron
density map of BamA residues Tyr504–Tyr509 and Phe490–Phe494 of
BamABCDE complex contoured at 1σ. g, 2Fo − Fc electron density map of
BamB residues Tyr345–Trp348 contoured at 1σ.
Extended Data Figure 2 | Superimposition of the two BamACDE
complexes in the asymmetric unit. The BamACDE complex with the
full-length BamC, showing BamA (red), BamC (blue), BamD (magenta)
and BamE (cyan). Only the N-terminal loop of BamC was observed in the other
BamACDE complex in the asymmetric unit cell (yellow). The structure
data suggest that the role of BamC is to retain the ring structure of BamA
and BamD during OMP insertion. a, Membrane view of the superimposed
BamACDE complexes. The primary difference is that one complex has a
complete BamC subunit, which binds BamD, BamE, POTRA 1 and 2,
while in the second complex only the N-terminal coil structure up to Pro88
is observed and the rest of BamC is disordered. The overall structures of
the two complexes are very similar with some conformational changes in
the β-strands of the barrel and extracellular loops with an r.m.s.d. of 0.908 Å over
378 Cα atoms, while the periplasmic circular structure has some rotation
(see arrows) with an r.m.s.d. of 4.706 Å over 385 Cα atoms. b, Periplasmic
view of the superimposition of the two structures. The periplasmic circular
structure has some rotations when the C-terminal global domain binds
on the POTRA 2. c, Superimposition of the barrels of the two complexes.
d, Superimposition of the two BamCs. The N-terminal coil structures
superimpose well with an r.m.s.d. of 0.807 Å over 86 Cα atoms.
Extended Data Figure 3 | Molecular dynamics simulation of BAM
complexes. a, b, BamABCDE (a) and BamACDE (b) structures modelled
with all subunits present and embedded in a model E. coli outer membrane
(grey). Phosphate atoms are shown in orange spheres. Lipid-modified
cysteine residues of BamB, BamC, BamD and BamE are shown in yellow
spheres. c, Both complexes are stable in molecular dynamics simulations,
showing limited deviation from the starting configuration (shown in the
background). d, Simulations of the complexes of only BamA and BamD
subunits retain the ring structure. Without BamC present POTRA 1
(black circle) moves towards the membrane, while POTRA 3 (black arrow)
moves towards and interacts with the periplasmic loops of the barrel. The
dynamics of POTRA 3 appear to be modulated by BamB. e, Simulations
of BamA show enhanced dynamics of the POTRA domains, with POTRA
1 and 2 rotating towards the membrane in an anticlockwise direction
(blue arrow). This separates POTRA 3 from the barrel (black arrow). This
conformation of the POTRA domains is unable to form the BAM ring,
highlighting the essential nature of BamD and its interactions with BamA
in maintaining the ring structure.
[Panel titles: a, BamABCDE; b, BamACDE; c, BamABCDE; d, BamAD; e, BamA-only.]
Extended Data Figure 4 | BamA of the BamABCDE complex is
superimposed onto the other published BamA structures. All the
published BamA structures are in the inward-open conformation. In
all cases the BamA from BamABCDE is shown in red. a, The BamA of
BamABCDE complex is superimposed onto BamA of N. gonorrhoeae
(grey) (PDB accession 4K3B). The two barrel structures are similar with
an r.m.s.d. of 3.803 Å over 385 Cα atoms, but the conformations of the
five POTRA domains are quite different. The dotted circle indicates the
hydrophobic gap between β1C and β15C. b, BamA of E. coli (magenta)
(PDB accession 4N75). The two barrel structures superimpose well
with an r.m.s.d. of 0.644 Å over 385 Cα atoms, but differences are observed
for the β16C terminal residues. The C-terminal residues in BamA of
BamABCDE move towards the lumen of the barrel. c, BamA of E. coli
(yellow) (PDB accession 4C4V) with an r.m.s.d. of 1.382 Å over 365 barrel
Cα atoms. The conformations of POTRA 5 are quite different.
d, BamA of Haemophilus ducreyi (green) (PDB accession 4K3C). The
barrel structures are similar with an r.m.s.d. of 2.376 Å over 365 barrel Cα
atoms, but the conformations of POTRA 4 and 5 are quite different.
Extended Data Figure 5 | The conformational changes between the
BamABCDE and BamACDE complexes and heat-modifiability
assays of the BamA double cysteine mutants. The two structures are
superimposed onto the BamA barrel structures of BamABCDE and
BamACDE complexes with an r.m.s.d. of 4.85 Å over the 379 barrel Cα
atoms and a maximum r.m.s.d. of 20 Å. The POTRA domains align
with an r.m.s.d. of 5.764 Å over 384 Cα atoms with a maximum of 15 Å.
The BamABCDE complex is in the same colour scheme as Fig. 1.
The BamACDE complex is in yellow. The barrel strands β1C–β6C rotate
around 65° from BamABCDE to BamACDE, while the BAM periplasmic
unit rotates around 30° in an anticlockwise direction from BamABCDE to
BamACDE. a, Membrane view of the superimposition of the BamABCDE
and BamACDE complexes. The conformations of BamA POTRA domains,
BamB, BamC and BamD are considerably different between the two
complexes. b, The periplasmic view of the superimposition of BamABCDE
and BamACDE. The circular units rotate around 30° between the two
BAM complexes. c, The residues involved in closing the barrel at the
periplasmic side in the BamACDE structure. d, Heat-modifiability assays
of the BamA double cysteine mutants. SDS-PAGE/western blot analysis of
the wild-type BamA, BamA Gly393Cys/Gly584Cys, Glu435Cys/Ser665Cys
and Glu435Cys/Ser658Cys mutants showed heat-modifiability,
indicating that the three double cysteine BamA mutants were correctly
folded into the OM. F, folded; U, unfolded. See Supplementary Fig. 5.
[Panel d gel labels: empty vector, G393C/G584C, E435C/S665C, E435C/S658C, WT; boiled (+) or not boiled (−); U and F band positions; α-His antibody.]
Extended Data Figure 6 | Periplasmic loops bind to BamA POTRA 3, 5,
BamD and BamE. In the BamABCDE complex, the BamA barrel interacts
with POTRA 3, 5, BamE and BamD through the periplasmic turns T-1,
-2, -3, -5, -6 and -7. a, In the BamABCDE complex, the residues of T-1,
-2 and -3 are involved in the interactions with POTRA 5, BamD and BamE.
b, Residues in T-5, -6 and -7 interact with POTRA 3 in the BamABCDE
complex. c, In the BamACDE complex no interactions are observed
between the periplasmic turns and POTRA 3. The figure shows that the
residues in T-1, -2 and -3 interact with residues in POTRA 5, BamD and
BamE. These structural data may suggest that BamB, C, D and E either
directly or indirectly control the conformation of the barrel through its
periplasmic turns.
Extended Data Figure 7 | BamA and BamD interactions, and
superimposition of the BamB structures and the conformational
changes of POTRA 2 and 3. a, BamA POTRA 1 and 2 interact with the
N-terminal domain of BamD. The interacting residues from both BamA
and BamD are shown. b, Functional assays of the BamA interaction
with BamD. The functional assays and the western blots were
repeated at least three times. The mutation BamA Glu373Lys is lethal, while mutant
Arg366Glu impairs bacterial growth, suggesting these residues may
have an important role in the BAM complex. 1–4 are as shown in c. c,
Protein expression levels of BamA mutations were detected by western
blotting. d, Periplasmic view of BamB of the BamABCDE complex (green)
superimposed onto the free BamB structure (orange) (PDB code 3Q7N)
with an r.m.s.d. of 1.81 Å over 351 Cα atoms with the maximum deviation
of 12 Å at loop 19. Loops 15, 19, 23 and 27 of BamB adopt conformational
changes to bind to POTRA 2 and 3. e, BamB of the BamABCDE complex
superimposed onto BamB in complex with POTRA 3 and 4 (magenta)
(PDB code 4PK1). The two BamB structures are very similar with an
r.m.s.d. of 0.5860 Å over 341 Cα atoms. f, Superimposition of BamABCDE
and BamACDE at POTRA 2 and 3 with an r.m.s.d. of 3.57 Å over 159 Cα
atoms. In the BamACDE structure the hinge angle between POTRA 2 and
3 is reduced, while POTRA 2 and 3 also separate from BamB, reducing
the interactions between BamB and POTRA 2 and 3.
Extended Data Figure 8 | BamE interacts with BamD and BamC, and
BamC interactions with BamD. BamE interacts with BamA, BamD and
BamC. BamC binds extensively to the C-terminal domain of BamD.
a, BamE interacts with BamD in the BamACDE complex. BamE contacts
the C-terminal domain residues of BamD in the BamACDE complex.
b, BamE forms hydrophobic interactions with BamC in the BamACDE
complex. BamE residues Pro67, Phe68 and BamC residues Met56 and
Ile57 are shown. c, BamC forms contacts with BamA POTRA 1 in
BamACDE. BamA residues Phe31, Gln35, Val39 and BamC residues Gly94
and Arg96 are shown. d, BamC interacts with the C-terminal domain of
BamD. The interacting residues are shown as sticks. e, BamC interacts
with the N-terminal domain of BamD.
Extended Data Figure 9 | Conserved residues analysis of BAM complex.
a–d, Consurf residue conservation scores (1–9), plotted onto the
molecular surfaces as a colour scale for BamA (red), BamB (green), BamC
(blue), BamD (purple) and BamE (cyan), for the BamABCDE structure.
Regions of white/grey indicate poorly conserved residues, whereas a more
intense colour indicates highly conserved residues. Black dashed circles
represent the interaction points on removal of BamC (a), BamD (b), BamE
(c) and BamB (d). For each interaction patch a high density of conserved
residues is apparent.
Extended Data Figure 10 | Conformational differences of the BAM
subunits between the BamABCDE and BamACDE complexes, and BAM
complex interacts with lipid of the OM. The subunits of BamABCDE
are coloured as in Fig. 1, while the BamACDE subunits are in yellow.
a, Superimposition of the BamA subunits onto the barrel domain with an
r.m.s.d. of 4.85 Å over the 379 barrel Cα atoms and an r.m.s.d. of 5.76 Å
over the 384 Cα atoms of the POTRA domains. The BamA barrel has
notable conformational changes in β1C–β6C. The periplasmic POTRA
domains rotate about 30° from BamABCDE complex to BamACDE
complex, suggesting a novel rotation mechanism to facilitate OMP
insertion into the OM. b, Superimposition of the BamC structures. The
BamC structures have some conformational changes with an r.m.s.d. of
2.102 Å over 47 Cα atoms of the BamC N-terminal loop. The N-terminal
loop Cys25–Val35 becomes more ordered in BamACDE complex.
Particularly, the N-terminal domain and the C-terminal domain are
ordered and bind to POTRA 1, 2 and the N-terminal domain of BamD
in BamACDE complex. The N-terminal loops of the BamC structures
superimpose well between residues Val35 and Pro88. c, Superimposition of
the BamD structures with an r.m.s.d. of 1.201 Å over 203 Cα atoms.
The α-helices are conserved, but the loops have some conformational
changes, especially loop 6 (residues Asp121–Asp136) between α-helix
5 and α-helix 6. d, Superimposition of BamE structures with an r.m.s.d.
of 1.721 Å over 81 Cα atoms. The β-strands and α-helices of BamE are
well conserved, with minor conformational changes observed in the
loops. e, Lipid–protein interactions for the BamACDE structure. BamB
was modelled into the BamACDE complex by molecular modelling.
The BamABCDE complex was built in the OM (Methods), and the
residues interacting with lipids of the OM within 4 Å are shown in putty
representation to depict lipid interaction residues. Equivalent residues in
all five subunits BamA (red), BamB (green), BamC (blue), BamD (purple)
and BamE (cyan) interact with the membrane in all three independent
simulations. f, Lipid–protein interactions for the BamABCDE structure.
BamC was added to the BamABCDE complex by molecular modelling,
using the solved domain from the companion complex. BamABCDE
complex was inserted into the OMP, with lipid anchors designed
(Methods).
                        BamACDE Se-Met             BamABCDE NaI               BamABCDE Native

Data collection
Space group             P4 2,2                     P4,2,2                     P4,2,2
Cell dimensions
  a, b, c (Å)           254.16, 254.16, 179.22     116.72, 116.72, 432.44     116.69, 116.69, 435.19
  α, β, γ (°)           90.0, 90.0, 90.0           90.0, 90.0, 90.0           90.0, 90.0, 90.0
Wavelength (Å)          0.97951                    1.82330                    0.97949
Resolution (Å)          29.94–3.90 (4.02–3.90)*    29.86–4.00 (4.27–4.00)     49.65–2.90 (2.97–2.90)
Rmerge (%)              28.5 (>100.0)              24.8 (>100.0)              18.0 (>100)
CC1/2 (%)               99.9 (49.4)                100 (99.6)                 99.8 (50.8)
I/σI                    11.0 (0.9)                 37.0 (11.8)                15.0 (0.6)
Completeness (%)        99.8 (100.0)               98.5 (97.8)                100 (100)
Redundancy              27.1 (27.2)                158.00 (165.1)             26.4 (23.8)

Refinement
Resolution (Å)          29.92–3.90                                            49.65–2.90
No. reflections         73745                                                 67553
Rfactor / Rfree         30.44/31.93                                           27.62/30.41
No. atoms
  Protein               19796                                                 22815
  Ligand/ion            0                                                     0
  Water                 0                                                     0
B-factors (Å²)
  Protein               150                                                   118
  Ligand/ion            N/A                                                   N/A
  Water                 N/A                                                   N/A
R.m.s. deviations
  Bond lengths (Å)      0.010                                                 0.009
  Bond angles (°)       1.868                                                 1.609
Residues in Ramachandran plot
  Favored (%)           90.5                                                  91.6
  Allowed (%)           8.7                                                   7.7
  Outliers (%)          0.8                                                   0.7
PDB code                5D0Q                                                  5D0O

*Values in parentheses are for the highest-resolution shell.

Highest-resolution shell was taken as the point where CC1/2 > 30 along the strongest reciprocal lattice direction.

Data statistics shown for each wavelength are a combination of two datasets (BamACDE Se-Met) and four
datasets (BamABCDE NaI).

Rfactor = Σ||Fo| − |Fc||/Σ|Fo|, where Fo and Fc are the observed and calculated structure factors, respectively.

Rfree is calculated using 5% of total reflections, which are randomly selected as a free group and not used in
refinement.

Diffraction data for all structures were anisotropic, and axis-specific resolution cut-offs from AIMLESS
(CC1/2 > 0.3) for the refinement data are listed below for illustration:

BamACDE Se-Met h-k plane = 4.45 Å, l axis = 3.50 Å

BamABCDE Native h-k plane = 3.48 Å, l axis = 2.75 Å
Cosmic rays are the highest-energy particles found in nature.
Measurements of the mass composition of cosmic rays with energies
of 10^17–10^18 electronvolts are essential to understanding whether
they have galactic or extragalactic sources. It has also been proposed
that the astrophysical neutrino signal comes from accelerators
capable of producing cosmic rays of these energies. Cosmic
rays initiate air showers (cascades of secondary particles in the
atmosphere) and their masses can be inferred from measurements
of the atmospheric depth of the shower maximum (Xmax; the depth
of the air shower when it contains the most particles) or of the
composition of shower particles reaching the ground. Current
measurements have either high uncertainty, or a low duty cycle
and a high energy threshold. Radio detection of cosmic rays is
a rapidly developing technique for determining Xmax (refs 10, 11)
with a duty cycle of, in principle, nearly 100 per cent. The radiation
is generated by the separation of relativistic electrons and positrons
in the geomagnetic field and a negative charge excess in the shower
front. Here we report radio measurements of Xmax with a mean
uncertainty of 16 grams per square centimetre for air showers
initiated by cosmic rays with energies of 10^17–10^17.5 electronvolts.
This high resolution in Xmax enables us to determine the mass
spectrum of the cosmic rays: we find a mixed composition, with
a light-mass fraction (protons and helium nuclei) of about 80 per
cent. Unless, contrary to current expectations, the extragalactic
component of cosmic rays contributes substantially to the total flux
below 10^17.5 electronvolts, our measurements indicate the existence
of an additional galactic component, to account for the light
composition that we measured in the 10^17–10^17.5 electronvolt range.

Observations were made with the Low Frequency Array (LOFAR),
a radio telescope consisting of thousands of crossed dipoles with
built-in air-shower-detection capability. LOFAR continuously
records the radio signals from air showers, while simultaneously
running astronomical observations. It comprises a scintillator array
(LORA) that triggers the read-out of buffers, storing the full waveforms
received by all antennas.

We selected air showers from the period June 2011 to January 2015 
with radio pulses detected in at least 192 antennas. The total uptime 
was about 150 days, limited by construction and commissioning of the 
Figure 1 | Energy resolution. The distribution of f_r/f_p (blue bars) is fitted
with a Gaussian (red dashed curve), yielding a standard deviation of
σ = 0.12 on a logarithmic scale, which corresponds to an energy resolution
of 32%; this value is the quadratic sum of the radio and particle
energy resolutions. In this analysis, there was no absolute calibration
for the received radio power, so f_r has an arbitrary scale.


telescope. Showers that occurred within an hour of lightning activity
or that have a polarization pattern that is indicative of influences from
atmospheric electric fields are excluded from the sample.

Radio intensity patterns from air showers are asymmetric, owing to 
the interference between geomagnetic and charge-excess radiation. 
These patterns are reproduced from first principles by summing the 
radio contributions of all electrons and positrons in the shower. We 
use the radio simulation code CoREAS, a plug-in of CORSIKA,
which follows this approach. 

It has been shown that Xmax, the atmospheric depth of the shower
maximum, can be accurately reconstructed from densely sampled
radio measurements. (The atmospheric depth is the air density
integrated over the path that the particle has travelled, starting at the 
top of the atmosphere.) We use a hybrid approach that involves simul- 
taneously fitting the radio and particle data. The radio component is 
very sensitive to Xmax, whereas the particle component is used for the 
energy measurement. 
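
The definition of atmospheric depth as a path integral of air density can be illustrated numerically. The sketch below is not part of the analysis described here: it assumes an isothermal, flat atmosphere with a textbook sea-level density and scale height, both of which are illustrative values rather than parameters taken from this work.

```python
import math

RHO0_G_CM3 = 1.225e-3     # assumed sea-level air density (g cm^-3)
SCALE_HEIGHT_CM = 8.4e5   # assumed isothermal scale height (8.4 km)


def density(h_cm: float) -> float:
    """Air density at altitude h for the assumed exponential atmosphere."""
    return RHO0_G_CM3 * math.exp(-h_cm / SCALE_HEIGHT_CM)


def slant_depth(h_cm: float, zenith_deg: float = 0.0,
                top_cm: float = 1.2e7, steps: int = 20000) -> float:
    """Integrate density from the top of the atmosphere down to altitude h.

    Uses the midpoint rule over altitude and divides by cos(zenith) to
    account for the longer path of an inclined shower (flat-atmosphere
    approximation). Returns the depth in g cm^-2.
    """
    dh = (top_cm - h_cm) / steps
    vertical = sum(density(h_cm + (i + 0.5) * dh) * dh for i in range(steps))
    return vertical / math.cos(math.radians(zenith_deg))


print(round(slant_depth(0.0)), round(slant_depth(0.0, zenith_deg=60.0)))
```

With these assumed parameters a vertical path to sea level traverses roughly 1,030 g cm⁻², and a 60° zenith angle doubles the traversed depth.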

The fit contains four free parameters: the shower core position (x, y),
and scaling factors for the particle density (f_p) and the radio power (f_r).
If f_p deviates substantially from unity, then the reconstructed energy
does not match the simulation and a new set of simulations is produced.
This procedure is repeated until the energies agree within the
chosen uncertainties. The ratio of f_r and f_p should be the same for all
showers, and is used to derive the energy resolution of 32% (see Fig. 1).
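
The link between the fitted width of 0.12 on a logarithmic scale and the quoted 32% can be reproduced as follows. This is a minimal sketch that assumes the logarithm is base 10; that assumption is consistent with the quoted value, since a natural-logarithm width of 0.12 would correspond to only about 13%. The function names are illustrative.

```python
import math


def log10_sigma_to_fraction(sigma_log10: float) -> float:
    """Fractional spread corresponding to a one-sigma shift in log10 of a quantity."""
    return 10.0 ** sigma_log10 - 1.0


def quadrature_sum(*terms: float) -> float:
    """Combine independent resolutions in quadrature."""
    return math.sqrt(sum(t * t for t in terms))


resolution = log10_sigma_to_fraction(0.12)
print(f"{100 * resolution:.0f}%")  # prints "32%"
```

Because the 32% is described as a quadratic sum of the radio and particle contributions, `quadrature_sum` shows how two independent terms would combine; the individual contributions are not stated in the text and are therefore not filled in here.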

The radio intensity fits have reduced χ² values ranging from 0.9 to
2.9. All features in the data are well reproduced by the simulation (see
Extended Data Figs 1–5), which demonstrates that the radiation mechanism
is well understood. The reduced χ² values that exceed unity
could indicate uncertainties in the antenna response or the atmos- 
pheric properties that were not already accounted for, or limitations 
of the simulation software. 

Radio detection becomes more efficient for higher-altitude show- 
ers that have larger footprints (that is, larger areas on the ground in 
which the radio pulse can be detected). However, the particle trigger 
becomes less efficient because the number of particles reaching the 
ground decreases. To avoid a bias, we require that all the simulations 
produced for a shower satisfy a trigger criterion (see Methods). Above 
10^17 eV, this requirement removes four showers from the sample. At
lower energies, the number of showers excluded increases rapidly, and
so we exclude all showers with energies less than 10^17 eV from our
analysis. 

Furthermore, we evaluate the reconstructed core positions of all 
simulated showers. Showers with a mean reconstruction error greater 
Figure 2 | Measurements of ⟨Xmax⟩. Mean depth of the shower maximum
Xmax as a function of energy E for LOFAR, and for previous experiments
that used different techniques. Error bars indicate 1σ uncertainties.
The systematic uncertainties are Bee cm” on (Xmax) and 27% on E, as 
indicated by the shaded band. The Pierre Auger Observatory measures
the fluorescent light emitted by atmospheric molecules excited by
air-shower particles. HiRes/MIA used a combination of this fluorescence
technique and muon detection. The Yakutsk and Tunka arrays use
non-imaging Cherenkov detectors. The green (upper) lines indicate ⟨Xmax⟩
for proton showers simulated using QGSJETII.04 (solid) and EPOS-LHC
(dashed); the red (lower) lines are for showers initiated by iron nuclei.


than 5 m are rejected. This criterion does not introduce a composition
bias because it is based on the sets of simulated showers, not on the 
data. The final event sample contains 118 showers. 

The uncertainty in Xmax is determined independently for all showers,
and has a mean value of 16 g cm⁻² (see Extended Data Fig. 6).
Figure 2 shows our measurements of the average Xmax, ⟨Xmax⟩, which
are consistent with earlier experiments using different methods. The
high resolution for Xmax per shower allows us to derive more information
about the composition of cosmic rays, by studying the shape of
the Xmax distribution. For each shower, we calculate a mass-dependent
parameter:


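$$ a = \frac{\langle X_{\mathrm{proton}} \rangle - X_{\mathrm{shower}}}{\langle X_{\mathrm{proton}} \rangle - \langle X_{\mathrm{iron}} \rangle} \qquad (1) $$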


in which Xshower is the reconstructed Xmax, and ⟨Xproton⟩ and ⟨Xiron⟩
are mean values of Xmax for proton and iron showers, respectively,
predicted by the hadronic interaction code QGSJETII.04.
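
The mass-dependent parameter of equation (1) is straightforward to evaluate once the two mean depths are known. In the sketch below the numerical depths are placeholders chosen only for illustration; they are not QGSJETII.04 predictions, which depend on the shower energy.

```python
def mass_parameter(x_shower: float, mean_x_proton: float, mean_x_iron: float) -> float:
    """Equation (1): 0 for a shower at the mean proton depth, 1 at the mean iron depth."""
    return (mean_x_proton - x_shower) / (mean_x_proton - mean_x_iron)


# Hypothetical depths in g cm^-2, for illustration only.
MEAN_X_PROTON = 680.0
MEAN_X_IRON = 580.0

for x in (680.0, 630.0, 580.0):
    print(x, mass_parameter(x, MEAN_X_PROTON, MEAN_X_IRON))
```

Normalizing in this way places showers of different energies on a common scale, so that their values of a can be combined into a single distribution.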

The cumulative probability density function (CDF) for all showers 
is plotted in Fig. 3. First, we fit a two-component model of protons and 
iron nuclei (p and Fe), with the mixing ratio as the only free parameter. 
To calculate the corresponding CDFs we use a parameterization of the 
Xmax distribution fitted to simulations based on QGSJETII.04. The
best fit is found for a proton fraction of 62%, but this fit describes
the data poorly, with p = 1.1 × 10⁻⁶. (The test statistic for this fit is
the maximum deviation between the data and the model CDFs, and p 
represents the probability of observing this deviation, or a larger one, 
assuming the fitted composition model; see Methods.) 
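
The test statistic described above, the maximum deviation between the data and model CDFs, is the Kolmogorov-Smirnov distance. The sketch below computes it for a mixture model. It assumes Gaussian component shapes purely for illustration, whereas the analysis uses a parameterization fitted to simulations, and it does not reproduce the p-value calculation described in Methods.

```python
import math


def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of a normal distribution, used here as a stand-in component shape."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def mixture_cdf(x: float, components) -> float:
    """CDF of a mixture; components is a list of (fraction, mu, sigma) with fractions summing to 1."""
    return sum(frac * normal_cdf(x, mu, sigma) for frac, mu, sigma in components)


def max_cdf_deviation(samples, model_cdf) -> float:
    """Largest absolute difference between the empirical CDF and the model CDF."""
    xs = sorted(samples)
    n = len(xs)
    deviation = 0.0
    for i, x in enumerate(xs):
        f = model_cdf(x)
        # The empirical CDF steps from i/n to (i+1)/n at x; check both sides of the step.
        deviation = max(deviation, abs((i + 1) / n - f), abs(i / n - f))
    return deviation


# Hypothetical two-component model in the parameter a (proton-like near 0, iron-like near 1).
model = [(0.62, 0.0, 0.6), (0.38, 1.0, 0.2)]
data = [-0.4, 0.1, 0.3, 0.5, 0.9, 1.1]
print(max_cdf_deviation(data, lambda x: mixture_cdf(x, model)))
```

A small deviation indicates that the model CDF tracks the data closely; converting the deviation into a probability requires the procedure given in Methods.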

A better fit is achieved with a four-component model of protons and 
helium, nitrogen and iron nuclei (p, He, N and Fe), yielding p=0.17. 
Although the best fit is found for a helium fraction of 80%, the fit qual- 
ity deteriorates slowly when replacing helium nuclei with protons. This 
is demonstrated in Fig. 4, in which p is plotted for four-component 
fits for which the fractions of helium nuclei and protons are fixed, and 
the ratio of nitrogen and iron nuclei is the only free parameter. The 
total fraction of light elements (p and He) is in the range [0.38, 0.98] 
at a 99% confidence level, with a best-fit value of 0.8. The heaviest 
Figure 3 | Composition model fits. The cumulative probability density
of the parameter a (see equation (1)) determined from the data (blue line;
shading indicates the range in which p > 0.01) and several models, on
the basis of QGSJETII.04 simulations. The set that contains only proton
showers is centred around a = 0 and has a large spread (yellow dotted
line), whereas iron showers give a distribution with a small spread centred
around a = 1 (yellow dash-dotted line). A two-component model (p and Fe;
green dashed line) yields the best fit for a proton fraction of 62%, but does
not describe the data well (p = 1.1 × 10⁻⁶). A four-component model
(p, He, N and Fe; red dashed line) gives the best fit with 0% protons, 79%
helium, 19% nitrogen and 2% iron, with p = 0.17. The uncertainty on these
values is presented in Fig. 4.


composition that is allowed within systematic uncertainties has a 
best-fit light-element fraction of 0.6 and a 99% confidence interval of 
[0.18, 0.82]. For information about the systematic uncertainties and 
the statistical analysis, see Methods. 

The abundances of individual elements depend on the hadronic 
interaction model. The Xmax values predicted by EPOS-LHC are, on
average, 15–20 g cm⁻² higher than those predicted by QGSJETII.04
(see Fig. 2). This coincides with the separation in ⟨Xmax⟩ between, for
example, protons and deuterium or between helium and beryllium. 
Therefore, we present our result as a total fraction of light elements, to 
avoid placing too much emphasis on individual elements. 

Recent results from the Pierre Auger Observatory indicate that
the composition of cosmic rays at 10^18 eV, just below the ‘ankle’
(a hardening of the all-particle cosmic-ray spectrum), can be fitted
with a mixture of protons and either helium (QGSJETII.04) or nitrogen
(EPOS-LHC). As the energy decreases, the proton fraction of the
cosmic-ray composition decreases while the helium (or nitrogen)
fraction increases, down to the threshold energy of 7 × 10^17 eV. An
extrapolation of this trend to our mean energy of 3 × 10^17 eV connects
smoothly to our best-fitting solution in which helium dominates.

An ‘ankle’-like feature in the cosmic-ray energy spectrum at 10^17.1 eV
has been measured at the KASCADE-Grande experiment, at which
the spectral index for light elements changes to γ = −2.79 ± 0.08.
However, the light particle (p and He) fraction is found to be less than
30% at 3 × 10^17 eV (on the basis of figure 4 in ref. 4), which is considerably
lower than our value. In contrast to LOFAR, the composition
measurements presented in ref. 4 are based on the muon/electron
ratio. A muon excess compared to all commonly used hadronic interaction
models was reported. Inaccurate predictions of muon production,
or ⟨Xmax⟩, could be the cause of the discrepancy in the fraction of
light particles predicted by LOFAR and KASCADE-Grande.

If the ‘knee’ in the all-particle cosmic-ray spectrum (a steepening
near 3 × 10^15 eV) corresponds to the proton or helium cut-off of the
main galactic cosmic-ray population, then the corresponding iron
cut-off would lie at an energy at most 26 times larger. If the main
Figure 4 | p-value distribution for the four-component model. The
four-component model is explored further by fixing the proton and
helium fractions at all possible combinations, and solving for the nitrogen/
iron ratio. The p value (see Fig. 3) is plotted as a function of the proton
and helium fractions. The optimal fit (largest p value) is found for 0%
protons and 79% helium (p = 0.17), but the fit quality deteriorates slowly
when replacing helium with protons. The black contour line bounds
all combinations for which p > 0.01. At this significance level, the total
fraction of light elements (p and He) lies between 0.38 and 0.98.


population of galactic cosmic-ray sources still dominates at 10^17 eV,
then the mass composition of the cosmic rays should be dominated by 
heavy elements at that energy. Therefore, the large component of light 
elements observed with LOFAR must have another origin. 
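
The factor of 26 in this argument matches the nuclear charge of iron, which is what a cut-off at fixed rigidity (energy proportional to charge Z) would give. The sketch below makes that scaling explicit; the rigidity interpretation and the function name are assumptions used for illustration, while the knee energy of 3 × 10^15 eV is taken from the text.

```python
KNEE_EV = 3e15  # knee energy quoted in the text
CHARGE = {"p": 1, "He": 2, "N": 7, "Fe": 26}  # nuclear charge Z


def cutoff_energy(element: str, reference_element: str,
                  reference_cutoff_ev: float = KNEE_EV) -> float:
    """Cut-off energy of an element, assuming cut-offs scale with nuclear charge Z."""
    return reference_cutoff_ev * CHARGE[element] / CHARGE[reference_element]


# Iron cut-off if the knee is the proton cut-off, or the helium cut-off.
print(cutoff_energy("Fe", "p"), cutoff_energy("Fe", "He"))
```

In both cases the iron cut-off falls below 10^17 eV (7.8 × 10^16 eV and 3.9 × 10^16 eV, respectively) under this assumed scaling, so the main population could contribute at 10^17 eV only through its heaviest elements.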

In principle, it is possible that we observe an extragalactic 
component. In that case, the ‘ankle’ in the cosmic-ray spectrum, at energies 
slightly greater than 10¹⁸ eV, does not indicate the transition from 
galactic to extragalactic origin. Instead, it can be explained as the 
imprint of pair production on the cosmic microwave background on 
an extragalactic proton spectrum²². However, because this feature only 
appears for a proton-dominated flux it is contrary to our data that 
indicate a mixture of light elements. 

A second galactic component, dominating around 10¹⁷ eV, could be 
produced by a class of extremely energetic sources (galactic exatrons), 
such as the explosions of Wolf-Rayet stars into their stellar winds²³ or 
past galactic gamma-ray bursts²⁴. Alternatively, the original galactic 
population could be reaccelerated by the galactic-wind-termination 
shock²⁵. Such scenarios predict mixtures of light elements, consistent 
with our results. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Event selection. Cosmic ray detection at LOFAR runs continuously in the back- 
ground during astronomical observations. When 16 out of the 20 scintillator sta- 
tions of the LORA particle array detect a signal, a ‘trigger’ is issued and the ring 
buffers of all active antennas within about a 1-km radius are stored for offline 
analysis!*. Which antennas are active depends on the settings of the astronomical 
observation. For this analysis, we selected showers that were measured with at least 
four antenna stations (corresponding to at least 192 antennas) in the low band 
(30-80 MHz after filtering). 

The trigger and selection criteria introduce a composition bias. This bias is 
removed by excluding certain showers on the basis of dedicated sets of simulations 
that are produced for each observed shower. Each of these sets contains 50 proton 
and 25 iron showers that span the whole range of possible shower depths. A shower 
is only accepted if all simulations in its set satisfy the trigger and selection criteria. 
This anti-bias exclusion removes many showers below 10¹⁷ eV, but only four above 
that energy. Consequently, we restrict our analysis to the higher-energy showers, 
imposing a minimum bound on the reconstructed shower energy of Ereco = 10¹⁷ eV. 
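
The anti-bias rule can be sketched as follows. This is an illustration only: the dictionary keys and function names are hypothetical, and the two thresholds are the trigger and selection criteria quoted above (16 of 20 scintillator stations, at least four antenna stations).

```python
def passes_criteria(n_antenna_stations, n_triggered_scintillators):
    """Trigger and selection criteria as quoted in the text."""
    return n_triggered_scintillators >= 16 and n_antenna_stations >= 4


def accept_shower(simulated_set):
    """Keep a shower only if every simulation in its dedicated set passes."""
    return all(
        passes_criteria(sim["stations"], sim["scintillators"])
        for sim in simulated_set
    )


# Hypothetical simulation set: one deep shower fails the antenna criterion,
# so the observed shower is rejected regardless of what was measured.
example = [
    {"stations": 5, "scintillators": 18},
    {"stations": 3, "scintillators": 17},
]
print(accept_shower(example))
```

Requiring all simulations to pass, rather than most of them, is what makes the selection independent of shower depth and hence of composition.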

Imposing this energy bound introduces another potential source of composi- 
tional bias, because the reconstructed energy might depend on the depth of the 
shower. However, in our reconstruction approach, this effect is very small because 
energy and Xmax are fitted simultaneously. Extended Data Fig. 7 shows distributions 
of the ratio between true and reconstructed energy for proton and iron simulations. 
The systematic offset between the two particle types is of the order of 1%. 

We used data from the Royal Netherlands Meteorological Institute to check 
for lightning-storm conditions during our observations. When lightning strikes 
were detected in the north of the Netherlands within an hour of a detection, 
the event was flagged and excluded from the analysis. The presence of electric fields 
in the clouds can severely alter the radio emission even in the absence of lightning 
discharges*’. The polarization angle of the radio pulse is very sensitive to the nature 
of the emission mechanism!>"! and is used as an additional veto against strong 
field conditions. 

Finally, a quality criterion is imposed on the sample so that only showers that 
have a core position and arrival direction that allows accurate reconstruction are 
included. We use the dedicated sets of simulations produced for each shower to 
derive uncertainties on core position, energy and Xmax. These three values are 
highly correlated, so a single criterion based on the core uncertainty of σcore < 5 m 
is sufficient. 

The quality criterion is based on the dedicated sets of simulations. These sets 
are produced for a specific combination of core position and arrival direction. 
Therefore, the quality criterion is effectively a criterion on position and direction, 
and does not introduce a composition bias. 

There is no criterion on the quality of the reconstruction of the actual data. By 
applying the criteria described above we obtain a sample of 118 showers that are 
fitted to the simulation yielding reduced χ² values in the range 0.9–2.9. Deviations 
from unity can be ascribed to uncertainties in antenna response, atmospheric 
properties such as the index of refraction, or limitations of the simulation software. 
Reconstruction. The energy and Xmax of the shower are reconstructed with the 
technique described in ref. 18. 
Statistical uncertainty. The statistical uncertainty on the power measurements of 
individual antennas includes three contributions. First, there is a contribution from 
the background noise, which is a combination of system noise and the galactic 
background. Second, there is a contribution from uncertainties in the antenna 
response model. There can be differences between the responses of antennas, 
either because of antenna properties (for example, cross-talk between nearby 
antennas) or because of signal properties (for example, polarization). Because 
these fluctuations are different for each shower core position and arrival direction, 
they are essentially random and so are included as a 10% statistical uncertainty 
on the power. Third, there is a contribution due to the error introduced by inter- 
polating the simulated pulse power. Strictly speaking this is not a measurement 
uncertainty, but it must be taken into account when fitting the data to simulation. 
The interpolation error is of the order of 2.5% of the maximum power¹⁸. The three 
contributions are added in quadrature and produce the 1σ error bars shown in 
Extended Data Figs 1–5. 
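
The quadrature sum described above can be sketched as follows. The function and argument names are hypothetical; the 10% and 2.5% factors are the values quoted in the text, and the noise term is assumed to be supplied in the same units as the power.

```python
import math


def power_uncertainty(power, noise_sigma, max_power):
    """1-sigma uncertainty on one antenna's pulse power (arbitrary units)."""
    response = 0.10 * power            # antenna-response term (10% of power)
    interpolation = 0.025 * max_power  # interpolation term (2.5% of maximum)
    return math.sqrt(noise_sigma**2 + response**2 + interpolation**2)


# Illustrative numbers only.
print(power_uncertainty(power=100.0, noise_sigma=3.0, max_power=160.0))
```

Because the interpolation term is tied to the maximum power of the shower rather than to the local power, it matters most for antennas far from the shower axis, where the signal is weak.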

The statistical uncertainty on Xmax is given by the quadratic sum of the uncer- 
tainties due to the reconstruction technique and the atmospheric correction. The 
former is found by applying our analysis to simulated events with added Gaussian 
noise, where the noise level is determined from the data. 

In the CORSIKA simulations, the standard US atmosphere model was used. The 
reconstructed shower depth is corrected for variations in the atmosphere using 
data from the Global Data Assimilation System (GDAS) of the NOAA National 
Climatic Data Center. We follow a previously developed procedure*, which 
typically leads to adjustments of the order of 5–20 g cm⁻². The remaining uncertainty 
after correction is of the order of 1 g cm⁻². 


The refractive index of air is a function of temperature, air pressure and relative 
humidity. Using local weather information, the final data were split in two groups 
of equal size, corresponding to conditions with relatively high or low refractive 
index. The mean reconstructed Xmax values of these two subsets deviate from that of the 
total sample by ±5 g cm⁻²; we adopt this value as an additional statistical 
uncertainty. Because the refractivity used in simulation corresponds to dry air, there is 
also an associated systematic error (see below). 

The total statistical uncertainty on Xmax is found by adding the above factors in 
quadrature. A distribution of the uncertainty for the showers in our final sample 
is shown in Extended Data Fig. 6. 

The energy resolution is 32% and is found by comparing energy scaling factors 
of the radio power and particle density fit (see Fig. 1). 

Systematic effects. The data have been subjected to several tests (outlined below) 
to determine the systematic uncertainty on the reconstructed values for Xmax. 
Zenith-angle dependence. The final data are split into two groups of equal size by 
selecting showers with a zenith angle below or above 32°. For both groups, the 
mean reconstructed Xmax is calculated, yielding deviations from the mean value 
of the complete sample of ±8 g cm⁻². This spread is larger than is expected from 
random fluctuations alone and is included as a systematic uncertainty. The 
dependence on zenith angle may be related to atmospheric uncertainties (see below). 
Refractive index of air. As explained above, the refractive index changes because 
of differences in atmospheric conditions. Fluctuations in Xmax due to changing 
humidity are of the order of 5 g cm⁻² with respect to the mean. However, the 
refractive index that was used in the radio simulations corresponds to dry air, and is a 
lower bound on the actual value. Therefore, the real value of Xmax can be higher 
than the reconstructed value, but not lower; we adopt an asymmetric systematic 
uncertainty of +10 g cm⁻². 

Hadronic interaction model. Because the reconstruction technique is based on full 
Monte Carlo simulations, it is sensitive to the choice of hadronic interaction model 
that is used. A comparison between QGSJETII.04, SIBYLL 2.1 and EPOS-LHC 
revealed that the uncertainty due to model dependence is about 5 g cm⁻². The 
uncertainty on the composition due to different models (in other words, on how 
to interpret the measured Xmax values) is larger. 

Radiation code. For this analysis we used the radiation code CoREAS, in which the 
contributions of all individual charges to the radiation field are added together. The 
advantage of this microscopic approach is that it is completely model-independent 
and based on first principles. ZHAireS*? is another microscopic code, which gives 
very similar results**. To calculate the emission, CoREAS uses the end-point for- 
malism*’, whereas ZHAireS is based on the ZHS algorithm*®. Both formalisms are 
derived directly from Maxwell’s equations and have been shown to be equivalent*”. 
The other difference between CoREAS and ZHAireS is that they take the particle 
distribution from different air-shower propagation codes (CORSIKA and AIRES, 
respectively) that internally use different hadronic interaction models. Because 
the radiation formalisms themselves are equivalent, small differences between 
CoREAS and ZHAireS are probably due to differences in the hadronic interaction 
models used to simulate the particle interactions. Therefore, the choice of radiation 
code does not introduce additional systematic uncertainty on top of the uncertainty 
due to hadronic interaction models that is already included. A comparison study 
with LOFAR data did not show any evidence for a systematic offset between the 
codes (S.B. et al., in preparation). 

The remaining small dependence of Xmax on zenith angle is possibly related to 
the refractive index. Showers with different inclination angles have their shower 
maximum at different altitudes and, therefore, different local air pressures and 
refractive indices. Consequently, increasing the refractive index used in simulations 
will result in a zenith-dependent change in reconstructed Xmax. This could poten- 
tially remove the observed dependence of the composition on zenith angle. 
Correctly taking into account a complete atmospheric model for the profile of the 
refractivity of air is the subject of further study. Here, we treat the effect conservatively 
by linearly adding the first two contributions to the uncertainty. The other two 
contributions are independent and are added in quadrature, yielding a total 
systematic uncertainty of 8 cm”?. 

The systematic uncertainty in the energy reconstruction with the LORA particle 
detector array is 27%, which includes effects due to detector calibration, hadronic 
interaction models and the assumed slope of the primary cosmic-ray spectrum in 
the CORSIKA simulations***®. 

Statistical analysis. For each observed shower, we calculate a using equation (1): 

$$a = \frac{\langle X_{\mathrm{proton}}\rangle - X_{\mathrm{shower}}}{\langle X_{\mathrm{proton}}\rangle - \langle X_{\mathrm{iron}}\rangle} \qquad (1)$$

in which Xshower is the reconstructed Xmax, and ⟨Xproton⟩ and ⟨Xiron⟩ are mean values 
predicted by QGSJETII.04"°. Therefore, a is an energy-independent parameter that 
is mass sensitive. A pure proton composition results in a wide distribution of a 
centred around zero, whereas a pure iron composition would result in a narrower 
distribution centred around one. 
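
As a sketch of equation (1), the function below maps a reconstructed shower depth onto a. The function name is hypothetical and the depths in the example (700 and 600 g cm⁻²) are made-up placeholders, not QGSJETII.04 predictions.

```python
def mass_parameter(x_shower, mean_x_proton, mean_x_iron):
    """Equation (1): 0 for a mean proton shower, 1 for a mean iron shower."""
    return (mean_x_proton - x_shower) / (mean_x_proton - mean_x_iron)


# Placeholder depths in g cm^-2, for illustration only.
print(mass_parameter(700.0, mean_x_proton=700.0, mean_x_iron=600.0))
print(mass_parameter(650.0, mean_x_proton=700.0, mean_x_iron=600.0))
print(mass_parameter(600.0, mean_x_proton=700.0, mean_x_iron=600.0))
```

Showers that develop deeper than the mean proton shower give negative values of a, so the distribution is not confined to the interval between 0 and 1.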

From the measurements we construct a cumulative distribution function (CDF) 
using the following Monte Carlo approach. A realization of the data is made by 
taking the measured values for the energy and Xmax, adding random fluctuations 
based on the statistical uncertainty of these parameters, and calculating a and the 
corresponding CDF. By constructing a large number of realizations with different 
random fluctuations, we calculate the mean CDF and the region that contains 
99% of all realizations. These are indicated in Fig. 3 as the solid blue line and the 
shaded region, respectively. 

We fit theoretical CDFs on the basis of compositions with two or four mass 
components to the data. The test statistic in the fit is the maximum deviation 
between the data and the model CDFs. The p value represents the probability of 
observing this deviation, or a larger one, assuming the fitted composition model. 
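
The Monte Carlo construction of the mean CDF and the maximum-deviation test statistic can be sketched as below. This is a simplified illustration, not the analysis code: it fluctuates a directly with a per-shower Gaussian width, whereas the analysis fluctuates energy and Xmax and then recomputes a. All function names are hypothetical.

```python
import bisect
import random


def empirical_cdf(sample, grid):
    """Fraction of sample values that are <= each grid point."""
    ordered = sorted(sample)
    return [bisect.bisect_right(ordered, x) / len(ordered) for x in grid]


def mean_cdf(values, sigmas, grid, n_realizations=1000, seed=0):
    """Average the CDF over realizations with Gaussian fluctuations added."""
    rng = random.Random(seed)
    total = [0.0] * len(grid)
    for _ in range(n_realizations):
        realization = [rng.gauss(v, s) for v, s in zip(values, sigmas)]
        for i, c in enumerate(empirical_cdf(realization, grid)):
            total[i] += c
    return [t / n_realizations for t in total]


def max_deviation(cdf_data, cdf_model):
    """Test statistic: largest absolute difference between two CDFs."""
    return max(abs(d - m) for d, m in zip(cdf_data, cdf_model))
```

The p value then follows from comparing the observed maximum deviation with its distribution under the fitted composition model, for example by drawing many simulated samples from that model.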

We first use a two-component model of proton and iron nuclei, in which the 
mixing ratio is the only free parameter. The best fit is found for a proton fraction 
of 62%, but it describes the data poorly, with p value of 1.1 x 10°. 

A better fit is achieved with a four-component model (p, He, N and Fe), yielding 
p=0.17. Although the best fit is found for a helium fraction of 80%, the fit quality 
deteriorates slowly when replacing helium by protons. This is demonstrated in Fig. 4, 
in which p is plotted for four-component fits with the fractions of helium and 
proton fixed, and the ratio between nitrogen and iron is the only free parameter. 
The solid line in Fig. 4 bounds the parameter space in which p > 0.01. We construct 
a 99% confidence interval on the total fraction of light elements (p and He) by 
finding the two extreme values of this fraction that lie within the p > 0.01 region. 
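
A minimal sketch of this interval construction is given below, assuming the p values have already been computed on a grid of fixed proton and helium fractions. The function name, the dictionary layout and the toy grid are hypothetical.

```python
def light_fraction_interval(p_values, threshold=0.01):
    """Extreme p+He fractions among grid points with p value above threshold.

    p_values maps (proton_fraction, helium_fraction) to the fitted p value.
    """
    allowed = [fp + fhe for (fp, fhe), p in p_values.items() if p > threshold]
    return min(allowed), max(allowed)


# Toy grid with invented p values, for illustration only.
toy_grid = {
    (0.0, 0.75): 0.17,
    (0.25, 0.5): 0.05,
    (0.0, 0.25): 0.02,
    (0.0, 0.0): 0.001,
    (0.5, 0.5): 0.005,
}
print(light_fraction_interval(toy_grid))
```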

The total fraction of light elements (p and He) is in the range [0.38, 0.98] at the 
99% confidence level, with a best fit value of 0.8. The heaviest composition that is 
allowed within systematic uncertainties (see above) has a best-fit p+ He fraction 
of 0.6 and a 99% confidence interval of [0.18, 0.82]. 
Code availability. Data analysis was done with PyCRTools. PyCRTools is free 
software, available from http://usg.lofar.org/svn/code/trunk/src/PyCRTools, 
which can be redistributed and/or modified under the terms of the GNU General 
Public License as published by the Free Software Foundation, either version 3 of 
the License or any later version. 
Extended Data Figure 1 | Fitted lateral distributions. Lateral distribution 
of radio-pulse power for all 118 measured showers (red circles) and 
the corresponding best-fitting CoREAS simulation (blue squares). The 
distance to the shower axis is the distance between the antenna and the 
axis of the air shower. Therefore, a value of 0 corresponds to an antenna 
that is located at the position where the shower axis reaches the ground. 
The ID numbers are unique values that are used to label the detected 
air showers. a.u., arbitrary units. 
Extended Data Figure 3 | Fitted lateral distributions. Continuation of Extended Data Fig. 2. 
Extended Data Figure 5 | Fitted lateral distributions. Continuation of Extended Data Fig. 4. 
Extended Data Figure 6 | Distribution of the uncertainty on Xmax for all showers used in this analysis. 
The mean value is 16 g cm⁻². 
Extended Data Figure 7 | Energy reconstruction. Distributions of the 
ratio between true (Etrue) and reconstructed (Ereco) energy for proton (blue 
solid line) and iron (red dashed line) showers. The two types of showers 
have a systematic offset of the order of 1%. 
Controlling spin relaxation with a cavity 


A. Bienfait!, J. J. Pla2t, Y. Kubo!, X. Zhou!, M. Stern!4, C. C. Lo2, C. D. Weis®, T. Schenkel®, D. Vion!, D. Esteve, 


J.J. L. Morton? & P. Bertet! 


Spontaneous emission of radiation is one of the fundamental 
mechanisms by which an excited quantum system returns to 
equilibrium. For spins, however, spontaneous emission is generally 
negligible compared to other non-radiative relaxation processes 
because of the weak coupling between the magnetic dipole and 
the electromagnetic field. In 1946, Purcell realized¹ that the rate 
of spontaneous emission can be greatly enhanced by placing the 
quantum system in a resonant cavity. This effect has since been used 
extensively to control the lifetime of atoms and semiconducting 
heterostructures coupled to microwave² or optical³,⁴ cavities, and 
is essential for the realization of high-efficiency single-photon 
sources⁵. Here we report the application of this idea to spins in solids. 
By coupling donor spins in silicon to a superconducting microwave 
cavity with a high quality factor and a small mode volume, we reach 
the regime in which spontaneous emission constitutes the dominant 
mechanism of spin relaxation. The relaxation rate is increased 
by three orders of magnitude as the spins are tuned to the cavity 
resonance, demonstrating that energy relaxation can be controlled 
on demand. Our results provide a general way to initialize spin 
systems into their ground state and therefore have applications in 
magnetic resonance and quantum information processing⁶. They 
also demonstrate that the coupling between the magnetic dipole 
of a spin and the electromagnetic field can be enhanced up to the 
point at which quantum fluctuations have a marked effect on the 
spin dynamics; as such, they represent an important step towards 
the coherent magnetic coupling of individual spins to microwave 
photons. 

Spin relaxation is the process by which a spin reaches thermal 
equilibrium by exchanging an energy quantum ħω_s with its environment 
(where ħ is the reduced Planck constant and ω_s is the resonance 
frequency of the spin), for example in the form of a photon or a phonon, 
as shown in Fig. 1a. Understanding and controlling spin relaxation is 
essential in applications such as spintronics⁷, quantum information 
processing⁸, and magnetic resonance spectroscopy and imaging⁹. For 
such applications, the spin relaxation time T₁ must be sufficiently 
long to permit coherent spin manipulation; however, if T₁ is too 
long, it becomes a major bottleneck that limits the repetition rate of 
an experiment, which in turn affects factors such as the achievable 
sensitivity. Certain types of spins can be actively reset to their ground 
state by optical¹⁰ or electrical¹¹ means, owing to their specific 
energy-level scheme, and methods such as chemical doping have been used 
to influence spin relaxation times ex situ¹². Nevertheless, an efficient, 
general and tunable initialization method for spin systems is still 
lacking. 

At first inspection, spontaneous emission would appear unlikely to 
influence spin relaxation: for example, an electron spin in free space and 
at a typical frequency of ω_s/(2π) ≈ 8 GHz spontaneously emits photons 
at a rate of about 10⁻¹² s⁻¹. However, the Purcell effect provides a way 
to markedly enhance spontaneous emission and thus gain precise and 


versatile control over spin relaxation’. Consider a spin embedded in 
a microwave cavity of quality factor Q and frequency ω₀. If the cavity 
damping rate κ = ω₀/Q is greater than the spin-cavity coupling g, then 
the cavity provides an additional channel for spontaneous emission of 
microwave photons, governed by the Purcell rate*!? 


$$\Gamma_{\mathrm{P}} = \kappa\,\frac{g^{2}}{\kappa^{2}/4 + \delta^{2}} \qquad (1)$$

in which δ = ω₀ − ω_s is the spin-cavity detuning (see Fig. 1a and 
Methods). This cavity-enhanced spontaneous emission can be much 
larger than in free space, and is strongest when the spins and cavity are 
Figure 1 | Purcell-enhanced spin relaxation and experimental set-up. 
a, By placing a spin in a resonant cavity, radiative spin relaxation can be 
made to dominate over intrinsic processes such as phonon-induced 
relaxation. b, Top, a planar superconducting resonator with frequency 
ω₀ = 1/√(LC) consisting of an interdigitated capacitor (black; with a 
capacitance C) in parallel with an inductive wire (green; with an 
inductance L) is fabricated on top of Bi-doped ²⁸Si. A static magnetic field 
B₀ is applied parallel to the x-y plane of the 50-nm-thick aluminium layer, 
with a tunable orientation θ. Bottom, magnetic field lines of the microwave 
excitation field B₁ generated by the aluminium wire (arrows) are 
superimposed over the local concentration of Bi donors (red), obtained by 
secondary ion mass spectrometry (SIMS). c, The sample is mounted in a 
copper box that is thermally anchored at 20 mK, and probed by microwave 
pulses via asymmetric antennae that are coupled with rate κ₁ ≈ κ₂/5 to the 
resonator. Microwave pulses at ω₀ of power P_in are sent by antenna 1, and 
the microwave signal leaving via antenna 2 is directed to the input of a 
Josephson parametric amplifier (JPA). 
on resonance (δ = 0): Γ_P = 4g²/κ. Furthermore, the Purcell rate can be 
modulated by changing the coupling constant or the detuning, allowing 
spin relaxation to be tuned on demand. 
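
As a rough numerical illustration of equation (1), the sketch below evaluates the Purcell rate for a given coupling g, damping rate κ and detuning δ, all expressed as angular frequencies. The function name is hypothetical and the example values g/(2π) = 50 Hz and κ/(2π) = 23 kHz are used purely for illustration.

```python
import math


def purcell_rate(g, kappa, delta=0.0):
    """Equation (1): kappa * g^2 / (kappa^2 / 4 + delta^2), in s^-1."""
    return kappa * g**2 / (kappa**2 / 4 + delta**2)


g = 2 * math.pi * 50.0       # illustrative coupling (rad/s)
kappa = 2 * math.pi * 23e3   # illustrative damping rate (rad/s)

on_resonance = purcell_rate(g, kappa)
half_linewidth = purcell_rate(g, kappa, delta=kappa / 2)
print(on_resonance, half_linewidth)
```

On resonance the expression reduces to 4g²/κ, and detuning by half the cavity linewidth (δ = κ/2) halves the rate, which is the Lorentzian dependence exploited later to switch the relaxation on and off.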

Although the Purcell effect was used to detect spontaneous emission 
of radiofrequency radiation from nuclear spins coupled to a resonant 
circuit¹⁴, the corresponding Purcell rate Γ_P ≈ 10⁻¹⁶ s⁻¹ (or 1 photon 
emitted every 300 million years) was negligible compared to the 
intrinsic spin-lattice relaxation processes. For photon emission to become 
the dominant spin-relaxation mechanism, both a large spin-cavity cou- 
pling and a low cavity damping rate are needed; in our experiment, this 
is achieved by combining the microwave confinement provided by a 
micrometre-scale resonator with the high quality factors achieved by 
using superconducting circuits. 

The device consists of two planar aluminium lumped-element 
superconducting resonators (denoted A and B) patterned onto a silicon chip 
that was purified in nuclear-spin-free ²⁸Si and implanted with bismuth 
atoms (see Fig. 1b) at a sufficiently low concentration for collective 
radiation effects to be absent. A static magnetic field B₀ is applied in the 
plane of the aluminium resonators, at an angle θ from the resonator 
inductive wire, tunable in situ. The device is mounted inside a copper 
box and cooled to 20 mK. Each resonator can be used to perform 
inductive detection of the electron-spin resonance (ESR) signal of the 
bismuth donors: microwave pulses at ω₀ are applied at the resonator 
input, generating an oscillating magnetic field B₁ around the inductive 
wire that drives the surrounding spins; the quantum fluctuations of this 
field, present even when no microwave is applied, are responsible for 
the Purcell spontaneous emission. Hahn echo pulse sequences¹⁵ are 
used, resulting in the emission of a spin-echo in the detection 
waveguide, which is amplified with a sensitivity reaching the quantum limit 
by a Josephson parametric amplifier¹⁶ before demodulation at room 
temperature, yielding the integrated echo signal quadrature A_Q (see 
Methods). A more detailed description of the set-up is found in ref. 17. 

Bismuth is a donor in silicon¹⁸ with a nuclear spin I = 9/2. At 
cryogenic temperatures it can bind an electron (with spin S = 1/2) in 
addition to those shared with the surrounding Si lattice. The large hyperfine 
interaction A S·I between the electron and nuclear spin (in which S 
and I are the electron and nuclear spin operators, and A/h = 1.475 GHz 
with h the Planck constant) produces a splitting of 7.375 GHz between 
the ground and excited multiplets at zero magnetic field (see Fig. 2a 
for the complete energy diagram¹⁹). This splitting makes the system 
ideal for coupling to superconducting circuits²⁰,²¹. At low fields 
(B₀ < 10 mT, compatible with the critical field of aluminium), all 
Δm_F = ±1 transitions are allowed, where m_F is the projection of the 
total spin (F = I + S) along B₀. Considering only the transitions with 
largest matrix element, resonator A (ω₀,A/(2π) = 7.245 GHz, 
Q_A = 3.2 × 10⁵) crosses the |F, m_F⟩ = |4, −4⟩ ↔ |5, −5⟩ transition, 
whereas resonator B (ω₀,B/(2π) = 7.305 GHz, Q_B = 1.1 × 10⁵) crosses the 
transitions |4, −4⟩ ↔ |5, −5⟩, |4, −3⟩ ↔ |5, −4⟩ and |4, −2⟩ ↔ |5, −3⟩ 
(see Fig. 2a, b). 
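
The quoted zero-field splitting can be checked from the hyperfine constant. The sketch below assumes the standard zero-field eigenvalues of A S·I, namely E(F)/h = (A/2h)[F(F + 1) − I(I + 1) − S(S + 1)]; the function name is hypothetical.

```python
def hyperfine_energy_ghz(F, I=4.5, S=0.5, A=1.475):
    """Zero-field energy E/h (GHz) of the multiplet with total spin F."""
    return 0.5 * A * (F * (F + 1) - I * (I + 1) - S * (S + 1))


# Splitting between the F = 5 and F = 4 multiplets, equal to 5A/h.
splitting = hyperfine_energy_ghz(5) - hyperfine_energy_ghz(4)
print(f"{splitting:.3f} GHz")
```

The result, 5 × 1.475 GHz = 7.375 GHz, sits just above the two resonator frequencies, which is why modest fields of a few millitesla are enough to tune transitions into resonance.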

The echo signal A_Q from each resonator as a function of B₀ shows 
resonances at the expected magnetic fields, split into two peaks each 
with a full-width at half-maximum of Δω/(2π) ≈ 2 MHz (see 
Fig. 2a). As is explained in ref. 17, this splitting is believed to be the 
result of strain induced in the silicon at the donor implant depth of 
approximately 100 nm by the aluminium circuit deposited on the 
surface. In the following, we focus on the lower-frequency peak of the 
|4, −4⟩ ↔ |5, −5⟩ line, which corresponds to spins lying under the wire. 
Over the region occupied by these spins, the amplitude of the B₁ field 
varies by less than ±2%, as evidenced by the well-defined Rabi 
oscillations observed when we sweep the power of the refocusing pulse P_in at 
the cavity input (see Fig. 2c), which allows us to determine the input 
power of a π pulse for a given pulse duration. 

We measure the relaxation time T₁ by performing an “inversion- 
recovery” experiment²² (see schematic in Fig. 2d), with the static field 
B₀ aligned along x (θ = 0). A π pulse first inverts the spins whose 
frequencies lie within the resonator bandwidth κ_A/(2π) = 23 kHz or 
Figure 2 | ESR spectroscopy and Purcell-limited T₁ measurement. 
a, Top, dominant electron spin resonance transitions of the Si:²⁰⁹Bi spin 
system (see Methods). We use two resonators, A (green) and B (brown), 
with frequencies of 7.246 GHz and 7.305 GHz, respectively, that cross 
up to three spin transitions in the magnetic field range 0–6 mT, as seen 
in the echo-detected magnetic field sweep (bottom; vertically offset 
for clarity). Subsequent spin relaxation measurements were made 
at the magnetic fields indicated by the arrows, corresponding to the 
|F, m_F⟩ = |4, −4⟩ ↔ |5, −5⟩ transition for each resonator. The doublet 
structure of each transition is caused by strain exerted by the aluminium 
film on the donors¹⁷. b, Cavity linewidths for resonators A and B are found 
to be 23 kHz and 68 kHz, respectively, from fits (solid lines) to their 
measured transmission amplitude. c, Rabi oscillations are driven by 
varying the cavity input power of the refocusing π pulse (5 μs long) applied 
τ = 300 μs after the first π/2 pulse. Solid lines are exponentially damped 
sinusoidal fits. d, The inversion-recovery sequence is used to measure the 
spin relaxation time T₁. Spin polarization is measured with a Hahn echo 
sequence. A_Q is rescaled by its value for T ≫ T₁ (‘A_Q(∞)’) such that it varies 
from −1 when the spins are fully inverted to +1 at thermal equilibrium 
(see Methods for full sequence description). Data were obtained with the 
static field B₀ parallel to the inductor (θ = 0). Solid lines are exponential 
fits to the data with time constant T₁. The uncertainty is provided by the 
standard deviation in the exponential fit parameters. a.u., arbitrary units. 
In all panels, the symbols represent data for each resonator (A, green 
squares; B, brown circles). 


κ_B/(2π) = 68 kHz; this constitutes a small subset of the total number 
of spins because κ_A,B ≪ Δω. After a varying delay T, a Hahn echo 
sequence provides a measure of the longitudinal spin polarization. 
By fitting the data with decaying exponentials, we extract T₁ = 0.35 s 
for resonator A and T₁ = 1.0 s for resonator B. 

To quantitatively compare our results with the expected Purcell rate, 
it is necessary to evaluate the spin-resonator coupling constant 
g = γ_e⟨F, m_F|S_x|F + 1, m_F − 1⟩‖δB_⊥‖, in which γ_e/(2π) ≈ 28 GHz T⁻¹ is 
the electronic gyromagnetic ratio, S_x is the dimensionless Pauli operator 
for the electron spin and δB_⊥ is the component of the 
resonator-field vacuum fluctuations orthogonal to B₀ (see Methods). 
A numerical estimate yields g₀/(2π) = 56 ± 1 Hz for the spins located 
below the inductive wire in the resonator that are probed in our 
measurements, and for θ = 0. An independent estimate is obtained by 
measuring Rabi oscillations: their frequency Ω_R = 2g₀√n̄ directly yields g₀ 
given knowledge of the average intra-cavity photon number n̄, which 
can be determined with about 30% imprecision from P_in and the 
measured resonator coupling to the input and output antennae (see 
Methods). Using this method, we obtain g₀/(2π) = 50 ± 7 Hz for 
resonator A and 58 ± 7 Hz for resonator B, compatible with the numerical 
estimate. The corresponding Purcell time of the resonant spontaneous 
emission is Γ_P⁻¹ = 0.36 ± 0.09 s for resonator A and Γ_P⁻¹ = 0.81 ± 0.17 s 
for resonator B, in agreement with the experimental values. 
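
The quoted Purcell times can be approximately reproduced from the resonant form of equation (1), Γ_P⁻¹ = κ/(4g²). The sketch below uses the central values given in the text (g₀/(2π) from the Rabi-oscillation estimate and the cavity linewidths κ/(2π)); the function name is hypothetical and uncertainties are not propagated.

```python
import math


def purcell_time(g_hz, kappa_hz):
    """Resonant Purcell time kappa / (4 g^2), with g and kappa given as f = omega / (2 pi)."""
    g = 2 * math.pi * g_hz
    kappa = 2 * math.pi * kappa_hz
    return kappa / (4 * g**2)


print(f"resonator A: {purcell_time(50.0, 23e3):.2f} s")
print(f"resonator B: {purcell_time(58.0, 68e3):.2f} s")
```

The central values obtained this way, roughly 0.37 s and 0.80 s, lie well within the stated uncertainties and are close to the measured T₁ of 0.35 s and 1.0 s.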

According to equation (1), a Purcell-limited T₁ should be strongly 
dependent on the spin-cavity detuning. We introduce a pulse in the 
magnetic field of duration T between the spin excitation and the 
spin-echo sequence (see Fig. 3a), which results in a temporary detuning δ of 
the spins. The amplitude of the echo signal A_Q as a function of T yields 
their energy relaxation time while they are detuned by δ. To minimize 
the influence of spin diffusion²³, the spin excitation is performed by a 
high-power long-duration saturating pulse (see Fig. 3a and Methods) 
instead of an inversion pulse as in Fig. 2d. As is evident in Fig. 3b, we 
find that the decay of the echo signal is well fitted by a single 
exponential with a decay time that increases with |δ|. The extracted T₁(δ) curve 
(see Fig. 3c) shows an increase in T₁ of up to three orders of magnitude 
when the spins are detuned away from resonance, until it becomes 
limited by a non-radiative energy decay mechanism with characteristic 
time Γ_NR⁻¹ = 1,600 ± 300 s. Given the doping concentration in our 
sample, this non-radiative decay time is consistent with earlier 
measurements of donor spin relaxation times²⁴, which have been attributed to 
charge hopping, but it could also arise here from spatial diffusion of the 
spin magnetization away from the resonator mode volume. It is shown 
in Fig. 3c that the T₁(δ) measurements are in agreement with the 
expected dependence (Γ_P(δ) + Γ_NR)⁻¹, with Γ_NR the only free 
parameter in this fit. 
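
The expected dependence can be sketched numerically as follows. The function name and default arguments are illustrative: the defaults take T₁ = 1.7 s on resonance and κ/(2π) = 68 kHz (the values given for resonator B in Fig. 3) together with Γ_NR⁻¹ = 1,600 s, and the Lorentzian factor follows from equation (1).

```python
def t1_seconds(delta_hz, t1_resonant=1.7, kappa_hz=68e3, t_nr=1600.0):
    """T1 = 1 / (Gamma_P(delta) + Gamma_NR) for a detuning delta / (2 pi) in Hz."""
    gamma_nr = 1.0 / t_nr
    gamma_p0 = 1.0 / t1_resonant - gamma_nr  # Purcell rate on resonance
    lorentzian = 1.0 / (1.0 + (2.0 * delta_hz / kappa_hz) ** 2)
    return 1.0 / (gamma_p0 * lorentzian + gamma_nr)


for delta in (0.0, 1e5, 1e6, 1e7):
    print(f"delta/2pi = {delta:.0e} Hz -> T1 = {t1_seconds(delta):.1f} s")
```

In this model T₁ grows quadratically with detuning once |δ| exceeds the cavity linewidth and then saturates at the non-radiative limit, giving the three-orders-of-magnitude range described above.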

Having demonstrated the effect of cavity linewidth and detuning on
the Purcell rate, we explore the effect of modulating the spin-cavity
coupling constant $g$. This can be achieved by varying the orientation $\theta$
of the static magnetic field $B_0$ in the x-y plane (Fig. 1b), which adjusts
the component of the microwave magnetic field ($B_1$, which is mostly
aligned along y under the inductive wire) that is orthogonal to $B_0$. More
precisely

$$g(\theta) = \gamma_e \langle F, m_F | S_x | F+1, m_F-1 \rangle \sqrt{\delta B_{1,y}^2 \cos^2(\theta) + \delta B_{1,z}^2} \qquad (2)$$

(since $\delta B_{1,x} = 0$). This orientation dependence is verified experimentally
by measuring the Rabi frequency as a function of $\theta$, as shown in
Fig. 4a, b, which allows us to extract $g(0)/(2\pi)$ = 58 Hz and
$g(\pi/2)/(2\pi)$ = 17 Hz. As expected, we measure longer spin relaxation
times for increasing values of $\theta$, as shown in Fig. 4c, with the relaxation
rate $T_1^{-1}$ scaling as $[g(\theta)]^2$, in agreement with equation (1). Overall, the
data in Figs 3 and 4 demonstrate unambiguously that cavity-enhanced
spontaneous emission is by far the dominant spin-relaxation channel
when the spins are resonant with the cavity, because the probability of
a spin-flip occurring as a result of emission of a microwave photon in
the cavity is $1/[1 + \Gamma_{NR}/\Gamma_P(\delta = 0)]$ = 0.999, very close to unity.
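
As a rough illustration of equation (2), the sketch below interpolates the coupling
between the two measured values quoted above, g(0)/(2π) = 58 Hz and g(π/2)/(2π) = 17 Hz.
It assumes that the y and z contributions add in quadrature exactly as in equation (2)
and that the relaxation rate scales as g squared; the function and variable names are
ours, not the authors'.

```python
import math

G0_HZ = 58.0    # g(0)/(2*pi) quoted in the text
G90_HZ = 17.0   # g(pi/2)/(2*pi) quoted in the text

def coupling_hz(theta_rad: float) -> float:
    """g(theta)/(2*pi) in Hz from the quadrature form of equation (2)."""
    gz_sq = G90_HZ ** 2            # contribution of the z component of the vacuum field
    gy_sq = G0_HZ ** 2 - gz_sq     # contribution of the y component of the vacuum field
    return math.sqrt(gy_sq * math.cos(theta_rad) ** 2 + gz_sq)

def t1_ratio(theta_rad: float) -> float:
    """T1(theta)/T1(0), assuming the relaxation rate scales as g^2."""
    return (G0_HZ / coupling_hz(theta_rad)) ** 2

if __name__ == "__main__":
    for deg in (0, 45, 90):
        th = math.radians(deg)
        print(f"theta = {deg:2d} deg  g/2pi = {coupling_hz(th):5.1f} Hz  "
              f"T1/T1(0) = {t1_ratio(th):5.2f}")
```

Under these assumptions, rotating the field from θ = 0 to θ = π/2 lengthens the
Purcell-limited relaxation time by roughly an order of magnitude.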

Figure 3 | Controlling Purcell relaxation by spin-cavity detuning.
a, In between their saturation and subsequent read-out, the spins are
detuned from the cavity by $\delta = 2\pi\gamma_{eff}B_\delta$ by applying a magnetic-field pulse
with an amplitude of $B_\delta$, with $\gamma_{eff}$ ≈ 25 GHz T$^{-1}$ for this transition and
magnetic field. b, Measured spin-polarization decays (symbols) for four
different detunings $\delta$, which are well fitted by exponential decays (lines),
with relaxation time constants $T_1$ increasing with the detuning (error bars
indicate the standard deviation of a measured echo). c, Measured $T_1$ as a
function of detuning $\delta$ (blue symbols). The red line is a fit with
$(\Gamma_P(\delta) + \Gamma_{NR})^{-1}$, yielding $\Gamma_{NR}^{-1}$ = 1,600 s. Error bars are estimates of the
standard deviation of the fit. These measurements are taken using
resonator B and with $\theta = \pi/4$, which results in $T_1$ = 1.7 s at $\delta$ = 0.

The spontaneous emission evidenced here is an energy-relaxation
mechanism that does not require the presence of a macroscopic
magnetization to be effective. Under the Purcell effect, each spin
independently relaxes towards thermal equilibrium by microwave photon
emission, so that when no intra-cavity thermal field is present, the
sample ends up in a fully polarized state after a time longer than $\Gamma_P^{-1}$,
regardless of its initial state. This is in stark contrast to the well-known
phenomenon of radiative damping of a transverse magnetization
generated by earlier microwave pulses, which is a coherent collective
effect under which the degree of polarization of a sample cannot
increase. Had our device possessed a larger spin concentration,
spontaneous relaxation would have occurred collectively, manifesting itself
as a non-exponential decay of the echo signal on a timescale faster than
$\Gamma_P^{-1}$ (ref. 13), and leading to an incomplete thermalization. The
existence of such super-radiant or maser emission requires the
dimensionless ‘co-operativity’ parameter $C = Ng^2/(\kappa\Delta\omega)$ (where $N$ is the total
number of spins) to satisfy $C \gg 1$ (refs 6, 25, 27), which is not the case
here because of the large inhomogeneous broadening of the spin
resonance caused by strain.

Figure 4 | Dependence of Purcell relaxation on spin-cavity coupling g.
a, Rabi oscillations (as in Fig. 2c) measured as a function of field
orientation $\theta$ (see Fig. 1b); the colour scale indicates the echo amplitude in
arbitrary units. b, The Rabi oscillations in a are used to extract the
spin-cavity coupling strength $g$ (blue symbols; error bars are determined by
the 30% accuracy on $P_{in}$). These data are fitted to equation (2) (red line); the
non-zero value of $g(\pi/2)$ is due to the finite out-of-plane component of the
microwave magnetic field. c, Inversion-recovery measurements (error bars
indicate the standard deviation of a measured echo) for different values of
$\theta$ confirm that the relaxation time $T_1$ (see inset; error bars are estimates of
the standard deviation of the fit) varies as $[g(\theta)]^{-2}$. The red line in the inset
is the Purcell formula predicted using the $g(\theta)$ dependence fitted from b.
All data were collected using resonator B.

Our demonstrated ability to modulate spin relaxation through three
orders of magnitude by changing the applied field by less than 0.1 mT
opens up new perspectives for spin-based quantum information
processing: long intrinsic relaxation times, which are desirable to maximize
the spin coherence time, can be combined with fast, on-demand
initialization of the spin state. Similarly, performing electron spin resonance
at dilution refrigerator temperatures can be prohibitively slow without
the ability to accelerate spin relaxation on demand. We also anticipate
that Purcell relaxation will offer a powerful approach to dynamical
nuclear polarization, for example, by tuning the cavity to match
an electron-nuclear spin flip-flop transition, enhancing the rate of
cross-relaxation to pump polarization into the desired nuclear spin
state (see Methods). The Purcell rate we obtain could be increased
by reducing the transverse dimensions of the inductor wire to yield
larger coupling constants (up to 5-10 kHz), which would reduce the
spontaneous emission time to less than 1 ms (thus permitting faster
repetition rates and a higher sensitivity), allowing for the possibility
of high-co-operativity coupling of a single spin to the microwave cavity
field. Our measurements constitute evidence that vacuum fluctuations
of the microwave field can affect the dynamics of spins, and, therefore,
are a step towards the application of concepts in circuit quantum
electrodynamics to individual spins in solids.
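
To put the projected improvement in perspective, here is a minimal numerical sketch.
It assumes the standard on-resonance Purcell rate Γ_P = 4g²/κ (our assumption for the
zero-detuning limit of equation (1), which is not reproduced in this section) and uses
κ/(2π) = 23 kHz, the resonator A linewidth quoted in the Methods. The function name is
illustrative.

```python
import math

def purcell_time_s(g_over_2pi_hz: float, kappa_over_2pi_hz: float) -> float:
    """On-resonance Purcell time 1/Gamma_P, with Gamma_P = 4 g^2 / kappa (assumed form)."""
    g = 2 * math.pi * g_over_2pi_hz            # coupling in rad/s
    kappa = 2 * math.pi * kappa_over_2pi_hz    # cavity energy damping rate in 1/s
    return kappa / (4 * g ** 2)

if __name__ == "__main__":
    kappa_hz = 23e3  # resonator A linewidth, kappa/(2*pi)
    for g_hz in (56.0, 5e3, 10e3):
        t = purcell_time_s(g_hz, kappa_hz)
        print(f"g/2pi = {g_hz:7.0f} Hz -> 1/Gamma_P = {t:.3e} s")
```

With g/(2π) = 56 Hz this gives roughly 0.3 s, the same order as the measured 0.35 s,
while couplings of 5-10 kHz bring the emission time well below 1 ms.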


Online Content Methods, along with any additional Extended Data display items and
Source Data, are available in the online version of the paper; references unique to
these sections appear only in the online paper.

METHODS

Bismuth donors in silicon. Bismuth donors in silicon have the following isotropic
spin Hamiltonian: $H/\hbar = \mathbf{B}\cdot(\gamma_e \mathbf{S}\otimes\mathbb{1} - \gamma_n \mathbb{1}\otimes\mathbf{I}) + A\,\mathbf{S}\cdot\mathbf{I}$, in which the electronic
gyromagnetic ratio $\gamma_e/(2\pi)$ = 27.997 GHz T$^{-1}$, the nuclear gyromagnetic ratio
$\gamma_n/(2\pi)$ = 6.9 MHz T$^{-1}$ and the hyperfine coupling constant $A/h$ = 1.475 GHz. For
a weak static field $B_0$ ($B_0$ < 50 mT) oriented along x, the eigenstates of the total
angular momentum $\mathbf{F} = \mathbf{S} + \mathbf{I}$ and its projection $m_F$ along $B_0$ represent good
quantum numbers for the 20 electro-nuclear energy states of the Bi:Si system. These
eigenstates can be grouped in an $F$ = 4 ground and an $F$ = 5 excited multiplet
separated by a frequency of $(I + 1/2)A/h$ = 7.35 GHz in zero-field (see Fig. 1d).
Transitions between states that verify $\Delta F\,\Delta m_F = \pm 1$ can be excited with a field
orientated along y (or z) because their associated matrix element
$\langle F, m_F|S_y|F+1, m_F+1\rangle = \langle F, m_F|S_z|F+1, m_F+1\rangle$ has the same magnitude as an
ideal electronic spin 1/2 transition $\langle m_s|S_x|m_s'\rangle$ = 0.5. Only the ten transitions with
a matrix element greater than 0.25 are shown in Fig. 2a. Characteristics for the
transitions probed by our resonators are given in Extended Data Table 1.
Single-spin coupling to the resonator. The spin-resonator interaction is described
by a Jaynes-Cummings Hamiltonian, $\hbar g(a^\dagger\sigma_- + a\sigma_+)$, in which $a$ ($a^\dagger$) is the field
annihilation (creation) operator, $\sigma_-$ ($\sigma_+$) is the spin lowering (raising) operator
and $g$ is the spin-resonator coupling strength. For the Bi:Si transitions
$|F, m_F\rangle \leftrightarrow |F+1, m_F-1\rangle$ probed by the resonators, $g$ can be expressed as
$g = \gamma_e \langle F, m_F|S_x|F+1, m_F-1\rangle \|\delta B_\perp\|$ (ref. 17), in which $\delta B_\perp$ is the component of
the resonator-field vacuum fluctuations orthogonal to $B_0$. Considering the
orientations for $B_0$ and $\delta B_1$ shown in Fig. 1b, we obtain equation (2)

$$g(\theta) = \gamma_e \langle F, m_F|S_x|F+1, m_F-1\rangle \sqrt{\delta B_{1,y}^2\cos^2(\theta) + \delta B_{1,z}^2}$$

To estimate the distribution of the coupling constant $g(\theta)$ for a given transition,
we need to estimate the vacuum-field fluctuations $\delta B_1$ in the spin ensemble region.
This is achieved using the COMSOL software and assuming a non-homogeneous
current-density distribution in the superconducting aluminium wire. The total
current flowing through the wire cross-section is $\delta i = \omega_0\sqrt{\hbar/(2Z_0)}$, in which
$Z_0 = \sqrt{L/C}$ is the resonator impedance, determined to be 44 Ω via electromagnetic
simulations realized in CST Microwave Studio. In all the work presented in the
main text, the measurements were done on the low-field peak of transition
$|F, m_F\rangle = |4, -4\rangle \leftrightarrow |5, -5\rangle$, which has been attributed to spins residing under the
wire. From the spin implantation profile (see Fig. 1b) and the spatial dependence
of the microwave field $\delta B_1$ restrained to the area under the wire (|y| < 2.5 μm), the
relevant coupling-constant distribution can be extracted. Doing so yields a very
asymmetric coupling distribution that is sharply peaked around $g/(2\pi)$ = 56 Hz
with a 2-Hz full-width at half-maximum for the transition $|4, -4\rangle \leftrightarrow |5, -5\rangle$ with
$\theta$ = 0. A more detailed derivation of the coupling constant and its estimate at $\theta$ = 0°
is available in ref. 17.
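
As a quick order-of-magnitude check of the vacuum current expression above, the sketch
below evaluates δi = ω0·sqrt(ħ/(2·Z0)) with Z0 = 44 Ω and the resonator A frequency
ω0/(2π) = 7.2467 GHz listed in Extended Data Table 2. The function name is ours, and
the constant is the CODATA value of ħ.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s

def vacuum_current_a(f0_hz: float, z0_ohm: float) -> float:
    """Zero-point current delta_i = omega_0 * sqrt(hbar / (2 * Z_0)), in amperes."""
    omega0 = 2 * math.pi * f0_hz
    return omega0 * math.sqrt(HBAR / (2 * z0_ohm))

if __name__ == "__main__":
    di = vacuum_current_a(7.2467e9, 44.0)
    print(f"delta_i ~ {di * 1e9:.1f} nA")
```

For these inputs the vacuum current comes out at roughly 50 nA, which sets the scale
of the field fluctuations fed into the finite-element calculation.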

Average intra-cavity photon number $\bar{n}$. The average intra-cavity photon number $\bar{n}$
of a pulse of power $P_{in}$ at the cavity input is evaluated as

$$\bar{n} = \frac{4\kappa_1 P_{in}}{\hbar\omega_0(\kappa_1 + \kappa_2 + \kappa_L)^2},$$

in which $\kappa_1$ and $\kappa_2$ are the couplings to the input and output antennas and $\kappa_L$
represents the internal losses of the resonator. From a previous calibration of the
experimental setup, we estimate that we can determine $P_{in}$ with an accuracy of
approximately 1 dB. The values of $\kappa_1$, $\kappa_2$ and $\kappa_L$ are determined experimentally by
measuring each element of the resonator scattering matrix and fitting to the
well-known input-output formulae; see Extended Data Table 2.

Data acquisition and echo signal. The full description of the experimental set-up
is available in ref. 17. The use of a Josephson parametric amplifier allows us to reach
a quantum-limited sensitivity. In addition to the Hahn echo sequence, we use a
Carr-Purcell-Meiboom-Gill sequence for every echo acquisition. For all $A_Q$ data
points presented in this work, 10 π pulses are added after the first echo to recover
10 extra echoes, which are subsequently averaged to boost the signal-to-noise ratio.
This scheme allows us to acquire data in single-shot read-out. Each $A_Q$ data point
is a single-shot measurement and the error bars are determined by the variance of
a pool of at least n = 200 measurements, taken in similar conditions.
Experimental determination of $T_1$ at resonance. The inversion-recovery
sequence is used to measure the spin relaxation time $T_1$; see Fig. 2d. Spin
polarization is measured with the following Hahn echo sequence: 50-μs-long π/2 pulse,
delay τ = 500 μs, and 100-μs-long π pulse. The pulse durations were chosen such
that only spins within a narrow spectral range were detected, producing a
well-defined Purcell-limited $T_1$. Indeed, because the probed ensemble of spins has a
larger linewidth $\Delta\omega$ = 2 MHz than do our resonators, the signal emitted during
the spin-echo comes from a subset of the ensemble of spins, with a frequency
spectrum at least as large as the resonator bandwidth. Spins probed at the edges
of the bandwidth of the resonator will have longer Purcell relaxation times; for
instance, those detuned by $\delta = \kappa$ have an expected Purcell relaxation time that
is five times longer than the $T_1$ time expected at perfect resonance; see Extended
Data Fig. 1a. The contribution of those spins with a longer decay time to the signal
will result in an averaging effect, meaning that the measured $T_1$ will be erroneously
longer than predicted.

To suppress this effect, we reduce the bandwidth of the read-out sequence to
collect signal only from spins very close to the resonance. The response function
of a pulse of length $t_p$ incident on a cavity with bandwidth $\kappa$ at frequency $\omega_0$ is
expressed as

$$R(\omega) = [\mathrm{sinc}(t_p(\omega - \omega_0)/2)]^2 R_{cav}(\omega) = \frac{[\mathrm{sinc}(t_p(\omega - \omega_0)/2)]^2}{1 + 4(\omega - \omega_0)^2/\kappa^2}$$

in which sinc(x) = sin(x)/x. As shown in Extended Data Fig. 1a, for the narrowest
bandwidth $\kappa/(2\pi)$ = 23 kHz of resonator A, pulses of 5 μs are heavily filtered by
the resonator and have the same bandwidth, whereas 100-μs-long pulses have a
reduced bandwidth of approximately 10 kHz. In case of 100-μs-long excitation
pulses, the Rabi frequency is such that only spins with $|\delta|/(2\pi)$ < 5 kHz will
contribute to the signal. This corresponds to a dispersion of only 5% for the expected
Purcell relaxation times, which is negligible. To illustrate the averaging effect,
two inversion-recovery curves are shown in Extended Data Fig. 1b, with readout
pulses of 5 μs and 100 μs. The former yields $T_1$ = 0.65 s, which is a factor two higher
than predicted by the Purcell effect, whereas the latter yields the expected value
$T_1$ = 0.35 s.
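
The filtering argument can be checked numerically. The sketch below evaluates the
response function R(ω) given above and finds its full width at half maximum by
bisection for 5-μs and 100-μs pulses with κ/(2π) = 23 kHz. It is a plain-Python
illustration with names of our choosing; taking 'bandwidth' to mean the FWHM of R is
an assumption, since the text does not say which measure is used.

```python
import math

def response(df_hz: float, t_p_s: float, kappa_hz: float) -> float:
    """R at detuning df = (omega - omega_0)/(2*pi); kappa_hz is kappa/(2*pi)."""
    x = math.pi * t_p_s * df_hz                    # equals t_p * (omega - omega_0) / 2
    sinc = 1.0 if x == 0 else math.sin(x) / x
    cavity = 1.0 / (1.0 + 4.0 * (df_hz / kappa_hz) ** 2)
    return sinc ** 2 * cavity

def fwhm_hz(t_p_s: float, kappa_hz: float) -> float:
    """Full width at half maximum of R, by bisection inside the main sinc lobe."""
    lo, hi = 0.0, 1.0 / t_p_s                      # R = 1 at lo, R ~ 0 at the first sinc zero
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if response(mid, t_p_s, kappa_hz) > 0.5:
            lo = mid
        else:
            hi = mid
    return 2.0 * lo

if __name__ == "__main__":
    for t_p in (5e-6, 100e-6):
        width = fwhm_hz(t_p, 23e3)
        print(f"t_p = {t_p * 1e6:5.0f} us -> FWHM ~ {width / 1e3:.1f} kHz")
```

Under these assumptions the short pulse inherits essentially the cavity linewidth,
while the long pulse is several times narrower, in line with the statement above.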

Therefore, Fig. 2d shows an inversion-recovery sequence that has a read-out
echo sequence with a narrow bandwidth ($t_\pi$ = 100 μs, $t_{\pi/2} = t_\pi/2$) to suppress
contributions from spins with a lower decay rate, and an inversion pulse with a large
bandwidth ($t_\pi$ = 5 μs) to maximize the efficiency of the inversion.

Given that the spin energy relaxation time $T_1$ is of the order of 1 s, we choose
a repetition rate $\gamma_{rep}$ that is sufficiently low to allow full relaxation of the spins
between successive inversion-recovery sequences: $\gamma_{rep}$ = 0.04 Hz.
Experimental determination of spin-cavity-detuning-dependent relaxation
rates. The spins are detuned from the cavity by applying an additional bias pulse
on one of the Helmholtz coils used to apply the static field $B_0$. The extra bias
pulse is output by a pulse generator with 50 Ω output impedance placed in
parallel to the d.c. supply of one of the Helmholtz coils. To minimize the effect of
transients due to the 1-Hz bandwidth of the coils, buffer times of 1 s are added
after ramping the coil up and down. To limit the loss of signal during these
buffer times, we use an angle $\theta$ = 45° and work with resonator B in order to
have a longer $T_1(0)$ = 1.68 s. Applying a magnetic-field pulse to a single coil
instead of both coils perturbs $\theta$ by at most 4°. The value of $T_1(0)$ was measured
with inversion recovery. All the data presented in Fig. 3 and in Extended Data
Fig. 2 were acquired in a separate run. The quality factor of resonator B decreased
from $Q = 1.07 \times 10^5$ to $Q = 8.9 \times 10^4$ owing to slightly higher losses, yielding the
resonator bandwidth $\kappa/(2\pi)$ = 82 kHz.

To observe the long relaxation times, such as those measured in Fig. 3,
inversion recovery is not an ideal method. When the spin linewidth is broader (about
20 times) than the excitation bandwidth and when the thermalization time is very
long, one can observe polarization mixing mechanisms, spectral and spatial
spin diffusion being the most relevant to our case, because the spin system is
composed of only one species. If we try to measure the relaxation from spins that have
been detuned by an amount $\delta/(2\pi) = (\omega_s - \omega_0)/(2\pi)$ = 3.8 MHz during a lapse of
time $T$ with an inversion-recovery sequence (Extended Data Fig. 2a), then we
observe a double-exponential relaxation (Extended Data Fig. 2d, green), which
we attribute to the existence of a spin-diffusion mechanism.

Spin diffusion is prevented by suppressing any polarization gradient along the 
spin line, which leads us to use a saturation-recovery scheme instead of inver- 
sion recovery. The simplest saturation-recovery scheme (Extended Data Fig. 2b) 
consists of sending a strong microwave tone that results in the saturation of the 
line, producing an incoherent mixed state with the population evenly distributed 
between excited and ground states. Nevertheless, a relaxation time measured using 
this scheme still yields a double-exponential decay (Extended Data Fig. 2d, orange), 
with time constants similar to those for the inversion recovery case. This implies 
that the saturation of the line is insufficient. 

To improve the saturation, we can sweep the magnetic field during the satura- 
tion pulse to bring different subsets of the spin line to resonance and realize a full 
saturation. The adopted sweep scheme is shown in Extended Data Fig. 2c. The 
corresponding relaxation curve fits well to a simple exponential decay (Extended 
Data Fig. 2d, blue), indicating the suppression of the spin-diffusion effect. 

We further check the quality of the saturation by measuring the polarization
across the full spin linewidth immediately after saturation. To realize such scans
(Extended Data Fig. 2e), we apply the relevant saturation pulse at $\omega_0$, then apply a
magnetic field pulse $B_\delta = (\omega_s - \omega_0)/\gamma_e$ and measure the echo signal $A_Q(\omega_s)$ with a
Hahn echo sequence. When no saturation pulse is applied, the measured echo
signal $A_{Q0}(\omega_s)$ is a measure of the full polarization $-\langle S_z(\omega_s)\rangle$ = +1 (Extended Data
Fig. 2e, black curve) and shows the natural spin linewidth. When studying an
excitation pulse, the polarization of the spins is $-\langle S_z(\omega_s)\rangle = A_Q(\omega_s)/A_{Q0}(\omega_s)$, in
which $A_Q(\omega_s)$ is the measured echo signal. Therefore, $-\langle S_z(\omega_s)\rangle$ = −1 indicates
full inversion, $\langle S_z(\omega_s)\rangle$ = 0 indicates saturation and $-\langle S_z(\omega_s)\rangle$ = +1 indicates
return to thermal equilibrium. The green, orange and blue curves are taken after
a π pulse, after saturation without field sweep and after saturation with field sweep,
respectively. At resonance, we expect a change of $S_z$ from −1 to +1 for a π pulse
and from −1 to 0 for a saturation pulse. Owing to the coil transient time, all three
curves show a partial relaxation. If the saturation was optimal and no partial
relaxation was occurring, then we should observe $S_z$ = 0 for any detuning $\delta$. For the two
saturations (with and without field sweep) studied here, only that with field sweep
equally saturates the line. The basic saturation with field sweep has a bandwidth
of approximately 250 kHz and the bandwidth of the π pulse is similar to that of the
cavity $\kappa/(2\pi)$ = 82 kHz. This finding confirms that spin diffusion is fully suppressed
only in a scheme of saturation with field sweep, to yield a simple exponential-decay
relaxation. This scheme is used to measure the 22 relaxation rates at different
detunings $\delta$ shown in Fig. 3.

The global fit shown in Fig. 3c is obtained using $[T_1(\delta)]^{-1} = \Gamma_P + \Gamma_{NR}$, which
may be expressed as $[T_1(0)]^{-1}[1 + 4(\delta/\kappa)^2]^{-1} + \Gamma_{NR}$ to involve only experimentally
determined parameters. Indeed, $\kappa$ is precisely determined by measuring the
quality factor of the resonator at low power, and $T_1(0)$ is determined by an
inversion-recovery sequence, as mentioned above. The parameter $\delta$ was determined via
precise calibration of the coil pulse. Therefore, the only remaining free parameter in
the fit is $\Gamma_{NR}$, yielding $\Gamma_{NR}^{-1}$ = 1,600 s. The error bars come from the accuracy of
the fits of the relaxation rates.
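
The fit model above is simple enough to evaluate directly. The sketch below is a
minimal Python illustration using the values quoted in this section for resonator B
(T1(0) = 1.68 s, κ/(2π) = 82 kHz, 1/Γ_NR = 1,600 s); the function name and the choice
of sample detunings are ours.

```python
def t1_seconds(delta_hz: float,
               t1_res_s: float = 1.68,     # T1 at zero detuning (resonator B)
               kappa_hz: float = 82e3,     # cavity linewidth kappa/(2*pi)
               t_nr_s: float = 1600.0) -> float:  # fitted 1/Gamma_NR
    """T1(delta) from 1/T1 = (1/T1(0)) / (1 + 4 (delta/kappa)^2) + Gamma_NR."""
    gamma_p = (1.0 / t1_res_s) / (1.0 + 4.0 * (delta_hz / kappa_hz) ** 2)
    gamma_nr = 1.0 / t_nr_s
    return 1.0 / (gamma_p + gamma_nr)

if __name__ == "__main__":
    # delta_hz is the detuning delta/(2*pi), in the same units as kappa_hz
    for delta in (0.0, 82e3, 1e6, 3.8e6):
        print(f"delta/2pi = {delta / 1e6:5.3f} MHz -> T1 ~ {t1_seconds(delta):8.1f} s")
```

At a detuning of one linewidth the Purcell term is five times weaker than on
resonance; at large detuning it becomes negligible and T1 saturates near 1/Γ_NR,
which accounts for the roughly three-orders-of-magnitude span reported in the main
text.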
Practical considerations for the application of cavity-induced relaxation in 
magnetic resonance. The experiments described in the main text take place at 
low magnetic field ($B_0$ < 10 mT) and low temperature ($T$ ≈ 20 mK) using a dilution
refrigerator. These are unusual conditions for magnetic resonance measurements; 
however, as we discuss below, with straightforward modifications, cavity-induced 
relaxation could be observed in other environments, broadening the class of spin 
systems that could be used. 

Superconducting microresonators can withstand large magnetic fields (up to
approximately 1 T) while maintaining a large quality factor ($Q_i \approx 2 \times 10^5$) if
they are patterned in metals such as Nb, NbN or NbTiN, instead of Al, as we have
used. The use of these alternative metals would enable our results to be applied
to a much larger class of spin systems, including typical electron spins with $g \approx 2$.
Similar observations can be made for temperature: Nb, NbN and NbTiN have a
higher critical temperature than does Al, which would permit the use of
temperatures of 1-4 K (accessible with conventional liquid helium cryostats). However,
temperature is important for reasons other than helping to maintain a small $\kappa$,
because the Purcell effect brings spins into thermal equilibrium with the cavity
field; for example, at the microwave frequencies used in our experiments (7.3 GHz),
temperatures below 70 mK are required for a spin polarization of >99%. Higher
temperatures could be used at the cost of the degree of spin polarization, but this
issue could be addressed by moving to higher frequencies. A third factor when
considering the operating temperature is that cavity-induced relaxation can only
be exploited when it dominates over intrinsic processes such as spin-lattice
relaxation. For most spin systems, this requirement translates into temperatures similar
to those of liquid helium.

The possibility of cavity-induced relaxation with conventional electron spin
systems might lead to applications other than those that benefit from a faster
return to thermal equilibrium to increase signal averaging rates. In particular, we
consider the possibility of cavity-assisted dynamic nuclear polarization, via either
the so-called solid effect or the Overhauser effect, which was recently observed
in solids. With the solid effect, the equilibrium polarization of a nuclear spin of
frequency $\omega_n$ coupled to an electron spin of frequency $\omega_e$ is enhanced by irradiating
it with microwaves at $\omega_e \pm \omega_n$, provided the electron spins return quickly enough
to equilibrium. Tuning a cavity on resonance with the electron spin transition at
$\omega_e$ could provide an alternative relaxation mechanism to phonons, thereby
avoiding, for example, phonon bottleneck effects and/or mitigating the need to apply
large magnetic fields. With the Overhauser effect, saturating the spin transition
by applying microwaves at $\omega_e$ enhances the nuclear spin polarization because of
the existence of electron-nuclear spin cross-relaxation processes, which could be
enhanced by tuning a cavity at $\omega_e - \omega_n$.

Sample size. No statistical methods were used to predetermine sample size.

Extended Data Figure 1 | Effect of excitation-pulse bandwidth on the measurement
of $T_1$. a, The red and blue lines show the computed pulse bandwidth (‘normalized
response’) for a 5-μs π pulse and a 100-μs π pulse, respectively, incident on a cavity
with $\kappa/(2\pi)$ = 23 kHz (green dashes). To illustrate the averaging effect of the pulse
bandwidth on $T_1$ measurements, the expected Purcell $T_1$ curve (black line) as a
function of spin-cavity detuning is plotted on the right axis, with $T_1(0)$ = 0.35 s
and $\kappa/(2\pi)$ = 23 kHz. b, $T_1$ measurements for two different π-pulse lengths
(see insets), measured on resonance with resonator A. Spin polarization is
measured with a Hahn echo sequence and $A_Q$ is rescaled by its value for
$T \gg T_1$ ($A_Q(T = \infty)$). Symbols are data and solid lines are exponential fits.
The 100-μs π pulse (blue) yields $T_1$ = 0.35 s, which is in agreement with
the Purcell rate. The 5-μs π pulse (red) yields $T_1$ = 0.65 s, a factor of two
greater than the accurate value.

Extended Data Figure 2 | Spectral spin diffusion. a-c, $T_1$ measurement
sequence when spins are detuned from the cavity by applying a magnetic
field $B_\delta$, providing a detuning of $\delta = \omega_s - \omega_0 = 2\pi\gamma_{eff}B_\delta$, with
$\gamma_{eff} = df/dB(B_0)$ the effective gyromagnetic ratio, evaluated as the
derivative of $f = \omega_s/(2\pi)$ with respect to the applied magnetic field $B$ at a
given magnetic field $B_0$. In a, a 5-μs π pulse is used to realize an
inversion-recovery sequence; in b, a 1-s-long strong microwave pulse sent at cavity
resonance is used to realize a saturation-recovery sequence; in c, a
magnetic field scan (bottom panel) is used in addition to a 6-s-long strong
microwave pulse to realize a saturation-recovery sequence. The expected
magnetic field profile due to the coil filtering, assuming that the coil is an
order-one low-pass filter with a bandwidth of 1 Hz, is shown in orange
(c, bottom panel). d, $T_1$ measurements for sequences shown in a (green),
b (red) and c (blue) for $\delta/(2\pi)$ = 3.8 MHz. The fits (black lines) to the
green and red data have a double-exponential decay, whereas the fit to the
blue data is a simple exponential. We attribute the double-exponential
decay (with extracted characteristic times $T_{1A}$ and $T_{1B}$) to spin diffusion.
e, Spectral profiles of the excitation pulse sequences shown in a (green),
b (red) and c (blue). The sequence is as follows: send the excitation pulse,
detune the spins and measure $A_Q(\omega_s)$. The black line is the reference
profile without any excitation pulse, yielding the reference polarization
$\langle S_z(\omega_s)\rangle = -A_{Q0}(\omega_s)/A_{Q0}(\omega_s) = -1$. When an excitation pulse is sent,
we can access $\langle S_z(\omega_s)\rangle = -A_Q(\omega_s)/A_{Q0}(\omega_s)$. To conserve the line shape
profile, we plotted $A_Q(\omega_s)/A_{Q0}(\omega_0)$ instead of $A_Q(\omega_s)/A_{Q0}(\omega_s)$. Neither the
π profile nor the saturation profiles reach the full inversion +1 or the full
saturation 0 at resonance; this is an artefact due to the coil transient time.

Extended Data Table 1 | Relevant Bi:Si transitions and their characteristics

Transition                                      $df/dB$        $\langle F, m_F|S_x|F+1, m_F-1\rangle$
$|4,-4\rangle \leftrightarrow |5,-5\rangle$     −25.1 GHz/T    0.47
$|4,-3\rangle \leftrightarrow |5,-4\rangle$     −19.2 GHz/T    0.42
$|4,-2\rangle \leftrightarrow |5,-3\rangle$     −13.5 GHz/T    0.37

Extended Data Table 2 | Resonator characteristics

Resonator   $\omega_0/(2\pi)$   $Q$                  $\kappa_1$ (s$^{-1}$)   $\kappa_2$ (s$^{-1}$)   $\kappa_L$ (s$^{-1}$)
A           7.2467 GHz          $3.2 \times 10^5$    $1.3 \times 10^4$       $5.8 \times 10^4$       $7.5 \times 10^4$
B           7.3054 GHz          $1.1 \times 10^5$    $3.6 \times 10^4$       $3.1 \times 10^5$       $8.2 \times 10^4$
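
As a worked illustration of how these tabulated rates enter the Methods expression
for the intra-cavity photon number, the sketch below evaluates
n = 4·κ1·P_in / [ħ·ω0·(κ1 + κ2 + κL)²] for resonator A and, as a cross-check, the
loaded quality factor Q = ω0/(κ1 + κ2 + κL). The input power of 1 pW is a purely
hypothetical value chosen for the example (no input power is quoted in this section),
and the function names are ours.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s

def photon_number(p_in_w, f0_hz, k1, k2, k_loss):
    """Mean intra-cavity photon number for an on-resonance drive through port 1."""
    omega0 = 2 * math.pi * f0_hz
    return 4 * k1 * p_in_w / (HBAR * omega0 * (k1 + k2 + k_loss) ** 2)

def quality_factor(f0_hz, k1, k2, k_loss):
    """Loaded quality factor Q = omega_0 / (total energy damping rate)."""
    return 2 * math.pi * f0_hz / (k1 + k2 + k_loss)

if __name__ == "__main__":
    # Resonator A values from Extended Data Table 2 (rates in 1/s)
    res_a = dict(f0_hz=7.2467e9, k1=1.3e4, k2=5.8e4, k_loss=7.5e4)
    p_in = 1e-12  # hypothetical input power in watts, for illustration only
    print(f"Q ~ {quality_factor(**res_a):.2e}")
    print(f"n ~ {photon_number(p_in, **res_a):.2e}")
```

The quality factor recovered from the summed rates is close to the tabulated value
for resonator A, which is a useful consistency check before using the rates in the
photon-number expression.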
Condensation on slippery asymmetric bumps


Kyoo-Chul Park¹, Philseok Kim², Alison Grinthal¹, Neil He¹, David Fox¹, James C. Weaver² & Joanna Aizenberg¹,²,³


Controlling dropwise condensation is fundamental to water-harvesting
systems, desalination, thermal power generation,
air conditioning, distillation towers, and numerous other
applications. For any of these, it is essential to design surfaces
that enable droplets to grow rapidly and to be shed as quickly as
possible. However, approaches based on microscale,
nanoscale or molecular-scale textures suffer from intrinsic
trade-offs that make it difficult to optimize both growth and transport at
once. Here we present a conceptually different design approach,
based on principles derived from Namib desert beetles, cacti,
and pitcher plants, that synergistically combines these aspects
of condensation and substantially outperforms other synthetic
surfaces. Inspired by an unconventional interpretation of the role
of the beetle’s bumpy surface geometry in promoting condensation,
and using theoretical modelling, we show how to maximize vapour
diffusion flux at the apex of convex millimetric bumps by
optimizing the radius of curvature and cross-sectional shape.
Integrating this apex geometry with a widening slope, analogous to
cactus spines, directly couples facilitated droplet growth with fast
directional transport, by creating a free-energy profile that drives
the droplet down the slope before its growth rate can decrease. This
coupling is further enhanced by a slippery, pitcher-plant-inspired
nanocoating that facilitates feedback between coalescence-driven
growth and capillary-driven motion on the way down. Bumps that
are rationally designed to integrate these mechanisms are able to
grow and transport large droplets even against gravity and overcome
the effect of an unfavourable temperature gradient. We further
observe an unprecedented sixfold-higher exponent of growth rate,
faster onset, higher steady-state turnover rate, and a greater volume
of water collected compared to other surfaces. We envision that
this fundamental understanding and rational design strategy can
be applied to a wide range of water-harvesting and phase-change
heat-transfer applications.

The central concepts for integrating the growth and transport of 
water droplets are derived from a combination of strategies used by 


Asymmetry 


Macroscopic bump topography 


Figure 1 | An overview of our approach. The convex millimetric bump 
topography of water-harvesting desert beetles inspired the rational design 
of bumps that optimize fast, localized droplet growth by focusing vapour 
diffusion flux (green dotted arrows) at the apex. We further designed the 
bump to include an asymmetric slope analogous to that of cactus spines for 
capillary-guided directional transport of harvested droplets, and a pitcher- 
plant-inspired molecularly smooth lubricant immobilized on nanotexture. 
The green dash-dotted line represents the depletion layer. The green, black 


three distinct biological examples—Namib desert beetles*??-*4, cacti”, 


and Nepenthes pitcher plants*°—as summarized in Fig. 1. Several 
desert beetle species harvest water using their bumpy backs, but the 
topography of these surfaces has been overlooked, primarily because 
most previous studies attributed preferential condensation to surface 
chemistry (hydrophilic bumps with hydrophobic surroundings) and 
discounted convex topography as inferior to concave, on the basis of 
research into microscale and nanoscale textures!*!4+727939. However, 
the beetle bumps are large (millimetres across), and recent studies have 
reported that the entire bumpy surfaces are homogeneously covered 
with hydrophobic wax’?4, thus questioning the role of localized sur- 
face chemistry in promoting condensation. We considered instead that 
even in the absence of localized chemical patterning or microscale/ 
nanoscale textures, the specific geometry of convex millimetre-sized 
surface structures alone could facilitate condensation, and therefore the 
topography of synthetic bumpy surfaces could potentially be designed 
to optimize fast, localized droplet growth by focusing vapour diffu- 
sion flux”*?””8 at the apex. We further hypothesized that fast growth of 
millimetre-sized droplets could be coupled to rapid directional turn- 
over by integrating optimized apex geometry with an asymmetric 
slope analogous to that used by cactus spines to guide capillary-driven 
transport of harvested water drops. Lastly, the negligible friction of the 
slippery coating of pitcher plants inspired us to coat the bumps with 
molecularly smooth lubricant immobilized on nanotexture to facilitate 
these topography-based mechanisms. 

To test the idea that droplets grow faster on convex millimetre-sized 
surface structures by focused diffusion flux, we measured the diam- 
eter of the largest droplet growing on the apex of a simple spherical- 
cap-shaped bump shown in Fig. 2a (see Supplementary Methods for the 
fabrication process, Supplementary Tables 1 and 2 for surface characteri- 
zation, and Supplementary Fig. 1 for experimental setup). In this test, the 
depletion layer thickness 6 (of the order of 10 mm), where convection 
normal to the surface is negligible, is much greater than the height of the 
cap (H~0.8 +0.25mm), making diffusion of water vapour the domi- 
nant mass transport mechanism7®. As shown in Fig. 2b, the diameter of 


Molecular-scale smooth 
lubricant on nanoscale texture 


Slippery asymmetric bump 


and red solid arrows represent the directions of convection, gravity g and 
drop transport, respectively. Approximate species dimensions are: desert 
beetle, 1.5 cm long; cactus, 20 cm in diameter; pitcher plant, 15 cm tall. 
Images are adapted with permission from Wikimedia Commons (desert 
beetle photograph by Hans Hillewaert, cactus photograph by Stan Shebs), 
and ref. 26 (pitcher plant photograph, copyright (2004) National Academy 
of Sciences, USA). 


1John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA. 2Wyss Institute for Biologically Inspired Engineering, Harvard University, 
Cambridge, Massachusetts, USA. 3Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, USA. 

Figure 2 | Design of, and droplet growth on, convex bumps. 
a, A profilometer image of a hexagonal array of millimetre-sized spherical- 
cap-shaped bumps (left), and the location of the plane (green dash-dotted 
line, right), below which diffusion is the dominant mechanism of mass 
transport (depletion layer thickness δ ≫ H). b, Time-lapsed images of 
droplets growing on the apex of the bumps (top row) compared to a flat 
region with the same height H (bottom row). Yellow circles indicate the 
largest droplet in each image (see Supplementary Video 1). c, A schematic 
illustration indicating the radius of curvature κ⁻¹, periodicity P_pattern, 
and half-bump width R_bump. d, Predicted diffusion flux J_C on the apex 


the largest droplet on the apex of the bump, denoted by yellow dotted 
circles in the top images, is greater than that of a droplet on the nearby 
flat regions of the same surface (yellow dotted circles in the bottom 
images) at each time point (see Supplementary Video 1). Detailed stud- 
ies were performed to rule out the role of surface roughness, chemistry 
and temperature differences in promoting such site-specific droplet 
growth (see the ‘Control Experiments’ section of the Supplementary 
Information, Supplementary Video 2, and Supplementary Figs 2-4 for 
discussion). These initial results indicate that the convex macroscopic 
surface topography (positive radius of curvature κ⁻¹ in Fig. 2c) plays 
an important part in controlling diffusion flux. 

of a spherical bump as a function of the radius of curvature κ⁻¹ (see 
Supplementary Fig. 5 for the analytical model used in the derivation of J_C). 
(J_flat is the predicted diffusion flux on a flat surface with the same depletion 
layer thickness.) e, Numerically calculated intensity profile of diffusion 
flux (COMSOL-Multiphysics). f, Time-dependent droplet growth on 
bumps with decreasing radii of curvature. 2r_max is the averaged diameter 
of the three largest droplets. g, Numerical calculation of diffusion flux and 
quantitative analysis of droplet growth on rectangular bumps of width W. 
h, Time-dependent droplet growth on bumps with decreasing width. 
All error bars are 1 s.d. 


To optimize the focused diffusion flux, we developed predictive 
models that quantify the magnitude and spatial profile of vapour flux 
as a function of the radius of curvature. A plot of the simple scaling 
of diffusion flux near the apex of a spherical cap (J_C ∝ 1/κ⁻¹; see the 
‘Theoretical modelling’ section of the Supplementary Information 
and Supplementary Fig. 5 for a detailed derivation) shows that the 
smaller the radius of curvature, the greater the localized diffusion flux 
(Fig. 2d). Numerical calculation (Fig. 2e) using the COMSOL- 
Multiphysics program (https://www.comsol.com/comsol-multiphysics) 
enables us to visualize more precisely the spatial distribution and 
intensity of diffusion flux on the bump and surrounding flat regions. 

Figure 3 | Capillary-driven transport. a, Transport (red arrows) on 
a rectangular bump (left) and an asymmetric slope (right). b, Energy 
profile of the asymmetric bump-droplet-vapour system, obtained by 
finite-element-based numerical calculation (Surface Evolver). E1 and E2 
represent the calculated energy values of the system in the purple and 
red boxes, respectively. c, A profilometer image (oblique view) showing 
nanostructure (inset) infused with lubricant, and the tangentially 
connected bottom slope. d, e, Time-lapsed optical images of condensed 
water droplets on an asymmetric bump rotated 180° (d) and 90° (e) 
relative to gravity, corresponding to Supplementary Video 3 (starting at 


As indicated by the red to yellow colour gradient, the area where the 
diffusion flux is greater than on the flat region with the same height 
becomes increasingly concentrated at and around the apex, and its 
maximum intensity grows stronger, as the radius of curvature decreases. 
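
The inverse dependence of flux on radius of curvature can be illustrated with a deliberately idealized sketch. It is not the analytical model of Supplementary Fig. 5 and not the COMSOL calculation: it assumes steady-state diffusion between two concentric spheres, with the vapour concentration fixed at a distance δ from the surface, and compares the resulting surface flux with that across a planar layer of the same thickness. The function names and the radii below are hypothetical; only the 10 mm depletion layer thickness is taken from the text.

```python
def flux_flat(D, delta_c, delta):
    """Steady-state diffusion flux across a planar depletion layer of thickness delta."""
    return D * delta_c / delta


def flux_sphere(D, delta_c, delta, radius):
    """Flux at the surface of a sphere of the given radius when the vapour
    concentration is fixed at a distance delta from the surface
    (solution of Laplace's equation between concentric spheres)."""
    return D * delta_c * (1.0 / delta + 1.0 / radius)


def enhancement(delta, radius):
    """Ratio of spherical to flat flux; D and delta_c cancel out."""
    return flux_sphere(1.0, 1.0, delta, radius) / flux_flat(1.0, 1.0, delta)


if __name__ == "__main__":
    delta_mm = 10.0  # depletion layer thickness, order of magnitude from the text
    for radius_mm in (4.0, 2.0, 1.0):  # hypothetical radii of curvature
        print(f"radius {radius_mm} mm: flux ratio {enhancement(delta_mm, radius_mm):.2f}")
```

In this idealization the ratio is 1 + δ/R, so it grows as the radius of curvature shrinks, in line with the trend described above. A spherical cap on a flat substrate will not follow this expression exactly.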

Consistent with the analytical and numerical models, we experi- 
mentally observe the largest droplet diameters at the apex of spherical- 
cap-shaped surface features that have the smallest radii of curvature 
(Fig. 2f, Supplementary Table 3). However, the rate of droplet growth 
on the bump with the smallest radius of curvature begins to slow down 
at later time points (see black curve in Fig. 2f). This change of slope 
suggests that the effect of the focused diffusion flux at the apex dimin- 
ishes when this region (represented by the red colour in the illustration 
in Fig. 2f) becomes covered by the growing droplet. To maintain the 
advantages of the small radius of curvature but avoid this decrease in 
growth rate, we changed the geometry of the apex from spherical to 
rectangular, with a flat region on top bordered by rounded edges as 
shown in Fig. 2g and h. This shape allows us to incorporate an even 
smaller radius of curvature around the perimeter, combined with an 
additional area of focused flux on the top flat area. For a smaller width, 
the superposition of diffusion flux focused on these features collectively 
results in a larger contiguous area of high diffusion flux. Droplets on 
the rectangular structure, therefore, continue growing for a longer time 
at a constant growth rate, as shown in Fig. 2h, because the coalescence 
and movement of the growing droplets to the flat top area of the bump 
continues to provide fresh sites for re-nucleation and growth. 

As the growing droplet begins to cover the curved edges, the shape of 
the rectangular structure—flat with curved borders—also lends itself to 
a mechanism to transport the droplet directionally, when topographical 
asymmetry is integrated into the design. Whereas a droplet growing 
on a rectangular column will eventually fall off in a random direction 
(Fig. 3a), the addition of a gradually widening slope descending from 
one side is expected to promote downward motion by enabling the drop 
to transition to a completely flat surface. As shown by the numerical 
calculation (Supplementary Fig. 6) the total free energy of the droplet- 
bump-vapour system is lowest when the droplet is on a completely 
flat region of such an asymmetric convex structure where the diameter 
of a droplet d equals the width of the underlying asymmetric convex 

running time ~19 s) and Supplementary Video 4 (starting at running time 
~8 s). The droplet shown by the yellow arrow partially covers the curved 
border (indicated by a higher reflection that can be seen as a thin bright 
region between the black sides and the grey flat top) of the asymmetric 
bump (t − t_c = −10 s, where t_c is the time of completed coalescence). 

The yellow asterisk indicates coalescence with another drop. The dotted 
yellow line tracks the vertical (d) or horizontal (e) progress of the droplet. 
See Supplementary Videos 3-5 for more information about the role of a 
gradually widening slope. 


structure W (Fig. 3b). Simulation (using the Surface Evolver program; 
http://facstaff.susqu.edu/brakke/evolver/evolver.html) suggests that 
the capillary force resulting from this energy profile on a surface with 
negligible friction would lead the drop to move down the slope towards 
the wider flat area (represented by the yellow surface in Fig. 3b) such 
that it will no longer overlap with the curved regions. 

To experimentally validate the latter hypothesis, we fabricated asym- 
metric bumps with a tangential connection between the descending 
slope and the surrounding flat regions (Fig. 3c). The negligible friction 
and pinning assumed in the numerical calculation were achieved by 
incorporating the pitcher-plant-inspired slippery nanocoating!!”"8 on 
top of the entire surface. On the fabricated slippery asymmetric struc- 
tures, droplets move even against gravity (Fig. 3d and Supplementary 
Video 3) because in such a system the capillary effect is dominant 
compared to the gravitational effect. This is captured by the Bond number, 
the ratio between gravitational force and capillary force, which is 
(ρ_water − ρ_air) g r² / γ_lv ≈ 1/7 at the length scale of the droplet, where ρ_water 
and ρ_air are the densities of water and air, g is the gravitational acceleration, r 
is the radius of the droplet, and γ_lv is the surface tension of the water- 
vapour interface. In addition, even though the lowest energy point for 
the moving droplet with a constant diameter is not the bottom of the 
bump (that is, its widest region) and the droplet might be expected to 
be pinned upon reaching the point where d=W, the droplet (indicated 
by a yellow arrow in Fig. 3d) keeps moving and accelerating as its size 
grows by coalescing with other small droplets on its path (for exam- 
ple, as indicated by an asterisk in Fig. 3d), thus continuing to satisfy 
the condition d > W. This mechanism guides droplet motion solely 
along the direction determined by the widening slope, regardless of 
the orientation of the bump relative to gravity, as shown in Fig. 3e and 
Supplementary Videos 4 and 5. 
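
As a quick check of the Bond number quoted above, the following sketch evaluates the expression for a millimetre-scale water droplet. The property values and the 1 mm radius are assumed textbook figures, not values reported in the source, and the function name is hypothetical.

```python
def bond_number(rho_liquid, rho_gas, g, radius, surface_tension):
    """Bond number: ratio of gravitational to capillary force (SI units)."""
    return (rho_liquid - rho_gas) * g * radius**2 / surface_tension


if __name__ == "__main__":
    bo = bond_number(
        rho_liquid=998.0,       # water density, kg m^-3 (assumed, near 20 C)
        rho_gas=1.2,            # air density, kg m^-3 (assumed)
        g=9.81,                 # gravitational acceleration, m s^-2
        radius=1.0e-3,          # assumed droplet radius of 1 mm
        surface_tension=0.072,  # water-vapour surface tension, N m^-1 (assumed)
    )
    print(f"Bo = {bo:.3f} (1/Bo = {1.0 / bo:.1f})")
```

With these assumed inputs the Bond number comes out at roughly one-seventh, so capillary forces dominate at this length scale. Because it scales with r², gravity takes over once droplets grow to several millimetres.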

When surface structures are designed to optimize and integrate all 
of these mechanisms cooperatively, droplets rapidly grow and start to 
shed much earlier than on other state-of-the-art surfaces. Figure 4a 
provides quantitative analysis of the droplet growth as a function of 
time, comparing the performance of slippery and superhydrophobic 
surfaces with and without asymmetric bumps. In agreement with ear- 
lier studies®”!!”, flat slippery surfaces outperform superhydrophobic 

Figure 4 | Coupled fast growth and transport. a, Quantitative analysis 
of growth dynamics and shedding compared to state-of-the-art surfaces. 
Both slippery asymmetric bumps and superhydrophobic asymmetric 
bumps show faster localized droplet growth in the early stage (t< 10°s) 
compared to flat slippery and superhydrophobic control surfaces. 

Each data point represents the averaged value of the largest droplet’s 
growth on each of three different bumps, or locations in the case of flat 
surfaces. The red box indicates further enhanced growth rate on slippery 
asymmetric bumps. t_first is the average time at which the first three drops 


surfaces in droplet size measured at a given time. Moreover, owing 
to the higher diffusion flux at the apex of the convex features, both 
slippery asymmetric bumps (solid black line) and superhydrophobic 
asymmetric bumps (dotted black line) show faster localized droplet 
growth in the early stage (t< 10°s) than do their flat controls. Even 
more interestingly, surfaces with slippery asymmetric bumps show a 
unique discontinuous behaviour, with a slope of ~0.82 at the early stage 
and ~6.4 at the later stage of droplet growth, which is more than six- 
fold higher than the maximum slope (~1) observed in typical droplet 
growth dynamics?””°. Supplementary Figs 7 and 8 demonstrate the 
importance of combining all the described features of the multi-scale 
surface structures in achieving successful droplet growth and transport: 
the absence of any of the design components noticeably slows down 

are transported solely by gravity. Each data point in the red box represents 
the average size of these three drops. The inset shows a magnified view 
of the fast growth. All error bars are 1 s.d. b, Time-lapsed images of 
droplets condensed on slippery surfaces. c, An exemplary array of the 
slippery asymmetric bumps (left image, black line in the plot) shows a 
substantially greater volume of water collected at the bottom of the surface 
(see Supplementary Video 6), compared to the flat slippery surfaces 
(right image, blue line in the plot). 


droplet growth dynamics. The accelerated growth (slope of 6.4) cap- 
tured by the magnified view in the inset of Fig. 4a (see Supplementary 
Fig. 9 for more information about the accelerated growth of individual 
droplets compared to other controls) can be interpreted as the feedback 
between coalescence-driven growth and capillary-driven transport 
discussed above. As a result, the fast-growing droplets on the slip- 
pery asymmetric bumps, which are aligned with gravitational force, 
are delivered to the bottom of the slope at a size where they can then 
be transported by gravity, while droplets on the adjacent flat slippery 
surfaces are still far below the critical shedding diameter (Fig. 4b). 

In our experiments, droplets shed from the slippery bumps 
within t_first 10°s (where t_first is the average time at which the first 
three drops are transported solely by gravity), whereas droplets on 
other state-of-the-art surfaces grow slowly, and shed much later (for 
example, t_first 4 < 10°s on flat slippery surfaces) or do not shed for 
more than t~ 10s (for example, on superhydrophobic surfaces and 
all other controls shown in Supplementary Figs 7 and 8). Owing to 
this faster droplet growth and transport performance, which yields 
a continuous, rapid steady-state turnover, a slippery surface with an 
exemplary array of the asymmetric structures designed in this study 
(Fig. 4c and Supplementary Video 6) shows a volume of water turnover 
an order of magnitude greater than that of the flat slippery surfaces 
and other state-of-the-art surfaces (for which t_first > 200 min as 
shown in Fig. 4a and Supplementary Fig. 7) developed for dropwise 
condensation? 71 b!?15:17-20, 

Further analysis of the steady-state turnover kinetics over 7h 
(Supplementary Fig. 10) highlights that the volume of water collected 
on the slippery surfaces with asymmetric structures progressively outpaces 
that collected on the control surface, with the advantage continuously 
becoming greater with time even after both reach t_first. By 
coupling fast growth and fast transport, the integrated bump design 
not only eliminates the long onset delay observed on other surfaces but 
also yields a substantially higher steady-state turnover rate. This combination 
of short response time and reliable, high-volume long-term 
performance is critical in numerous applications based on condensation, 
transport, and phase-change heat transfer, such as heat exchange, 
dehumidification, and desalination systems. The central principles 
derived in this work further enable the manipulation of drop behaviour 
against an unfavourable temperature gradient. Although larger droplets 
are expected at a colder surface, a drop at the apex of a bump with a 
small radius of curvature shows the opposite trend: increased drop size 
at higher temperature (Supplementary Fig. 4a). This unusual behaviour 
is relevant to pipe geometry, an analogous form of convex curvature that 
is widely used in phase-change heat transfer (Supplementary Fig. 4b). 

In summary, we have achieved unprecedented droplet growth and 
transport by designing surfaces covered with slippery asymmetric 
bumps based on quantitative models that integrate three mechanisms— 
(1) optimization of focused diffusion flux by the radius of curvature 
and shape of the apex, (2) asymmetric topography that guides the 
droplet off the bump by optimizing the free-energy profile, and (3) 
positive feedback between capillary transport and continued growth by 
coalescence along the slope. 

The rapid turnover kinetics, with fast onset and sustained continuous 
shedding rate, combined with the ability to defy a temperature gradi- 
ent, are crucial not only for water-harvesting applications, particularly 
in hot, arid regions where condensed water droplets will evaporate if 
they do not shed after a limited time, but also for many phase-change 
heat-transfer applications requiring reliable steady-state performance. 
It may also be possible to create switchable water-transporting sur- 
faces with flexible topographical features that can be tuned by external 
stimuli, and to discover fundamental principles of droplet growth and 
transport in practical and more complex situations such as with strong 
airflow. 

Copper and zinc form an important group of hydroxycarbonate 
minerals that include zincian malachite, aurichalcite, rosasite and 
the exceptionally rare and unstable—and hence little known and 
largely ignored’—georgeite. The first three of these minerals are 
widely used as catalyst precursors” “ for the industrially important 
methanol-synthesis and low-temperature water-gas shift (LTS) 
reactions® ’, with the choice of precursor phase strongly influencing 
the activity of the final catalyst. The preferred phase”**""” is usually 
zincian malachite. This is prepared by a co-precipitation method 
that involves the transient formation of georgeite!!; with few 
exceptions’? it uses sodium carbonate as the carbonate source, but 
this also introduces sodium ions—a potential catalyst poison. Here 
we show that supercritical antisolvent (SAS) precipitation using 
carbon dioxide (refs 13, 14), a process that exploits the high diffusion 
rates and solvation power of supercritical carbon dioxide to rapidly 
expand and supersaturate solutions, can be used to prepare copper/ 
zinc hydroxycarbonate precursors with low sodium content. These 
include stable georgeite, which we find to be a precursor to highly 
active methanol-synthesis and superior LTS catalysts. Our findings 
highlight the value of advanced synthesis methods in accessing 
unusual mineral phases, and show that there is room for exploring 
improvements to established industrial catalysts. 

For the SAS precipitation process, we initially used copper(II) 
acetate solutions (Extended Data Fig. 1) with ethanol as the solvent; 
the result was amorphous copper(II) acetate, which we characterized 
by X-ray diffraction (XRD) and Fourier transform-infrared 
(FT-IR) spectroscopy. Adding 5 vol% water as a co-solvent produced 
a blue precipitate with an IR spectrum exhibiting a broad OH band 
at 3,408 cm⁻¹ and CO3²⁻ bands at 1,470, 1,404 and 829 cm⁻¹, which 
are inconsistent with malachite but closely match the spectrum identifying 
the rare georgeite phase’. Helium pycnometry gave a density 
of 3.1 g cm⁻³ for the SAS-prepared phase; as with mineralogical georgeite, 
the SAS-prepared phase was amorphous by XRD. We further 
studied the precipitate by inductively coupled plasma mass spectrometry 
(ICP-MS) and carbon-hydrogen-nitrogen (CHN) analysis 
(Extended Data Fig. 2), which gave an ideal formula for the precipitate 
of Cu7(CO3)5(OH)4·5H2O. We regard this to be a close match to 
the Cu5(CO3)3(OH)4·6H2O formula reported for georgeite', taking 
into account that slight variations in composition are anticipated for 
an amorphous phase’. The SAS-prepared and mineralogical forms 
of georgeite both have a higher CO3²⁻/Cu²⁺ molar ratio than malachite 
(the CO3²⁻/Cu²⁺ values being 0.7, 0.6 and 0.5, respectively); 
this contrasts with previous assertions that georgeite and malachite 
(Cu2CO3(OH)2) are iso-compositional'® because georgeite rapidly 
SAS-prepared georgeite, as the rapid solvent extraction produced 


georgeite that is effectively dry from the point of precipitation and 
thus has limited contact with water (which facilitates ageing and trans- 
formation to malachite). 
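
The carbonate-to-copper ratios quoted above follow directly from the formula units. The short sketch below reproduces the arithmetic, reading the three formulas as Cu7(CO3)5(OH)4·5H2O, Cu5(CO3)3(OH)4·6H2O and Cu2CO3(OH)2, and also checks that each is charge-neutral for Cu²⁺, CO3²⁻ and OH⁻. The dictionary layout and function names are illustrative assumptions.

```python
from fractions import Fraction

# (copper atoms, carbonate groups, hydroxide groups) per formula unit,
# read off the formulas quoted in the text; waters of hydration are neutral.
PHASES = {
    "SAS-prepared georgeite": (7, 5, 4),
    "mineralogical georgeite": (5, 3, 4),
    "malachite": (2, 1, 2),
}


def carbonate_to_copper(n_cu, n_co3):
    """Exact CO3/Cu molar ratio of a formula unit."""
    return Fraction(n_co3, n_cu)


def is_charge_neutral(n_cu, n_co3, n_oh):
    """True if the Cu(2+) charge is balanced by CO3(2-) and OH(-)."""
    return 2 * n_cu == 2 * n_co3 + n_oh


if __name__ == "__main__":
    for name, (n_cu, n_co3, n_oh) in PHASES.items():
        ratio = carbonate_to_copper(n_cu, n_co3)
        print(f"{name}: CO3/Cu = {ratio} = {float(ratio):.1f}, "
              f"neutral = {is_charge_neutral(n_cu, n_co3, n_oh)}")
```

Rounded to one decimal place, the ratios are 0.7, 0.6 and 0.5, matching the values given in the text.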

When we used mixed copper and zinc acetate solutions (with a 2/1 
or 1/1 copper/zinc ratio) for the SAS precipitation, zincian georgeite 
(with a 2/1 or 1/1 copper/zinc ratio, as judged by ICP-MS) formed 
with a density, IR spectrum and XRD pattern nearly identical to those 
of copper-only georgeite (Extended Data Figs 1 and 2); no additional 
phases—such as smithsonite (ZnCO3)—were discernible by XRD. 
Scanning transmission electron microscopy (STEM) analysis of zin- 
cian georgeite revealed predominantly amorphous material, although a 
small fraction (<10 vol%) of CuO, crystallites with diameters less than 
2 nanometres were observed (Extended Data Fig. 3; analysis of geor- 
geite itself was not possible because of its instability under vacuum). In 
addition, the SAS-derived zincian georgeite had few impurities (such 
as sodium ions; Extended Data Fig. 2). 

We therefore found that it was possible to synthesize stable zincian 
georgeite by SAS in sufficient quantities to evaluate the catalytic per- 
formance of copper/zinc oxide, derived from zincian georgeite, in the 
methanol-synthesis and LTS reactions. As a control, we synthesized 
a zincian-malachite precursor—considered the optimal precursor 
for model copper/zinc-oxide catalysts*—by co-precipitation (see 
Methods). The catalysts used here were not stabilized with alumina, 
an important promoter and stabilizer of the commercial catalyst!”"®. 
Before use, the precursors were subjected to calcination to remove the 
carbonate, followed by in situ reduction to form the active catalyst. The 
optimal calcination temperature was determined using thermal gravi- 
metric analysis (TGA) with evolved gas analysis (EGA), which showed 
SAS-prepared georgeite to exhibit three distinct mass losses, and mala- 
chite to exhibit a single mass loss, associated with concurrent removal 
of water and carbon dioxide (Fig. 1). The multistep decomposition of 
georgeite and zincian georgeite (the 1/1 copper/zinc sample shown in 
Extended Data Fig. 2) can be separated into three regions: water loss 
at 80-100°C, water and carbon dioxide loss at 190-210°C, and carbon 
dioxide loss at 315-420 °C (carbonate decomposition). On the basis of 
this analysis, the precursors were calcined at 300°C in air for 6 hours: 
this is below the temperature associated with the evolution of carbon 
dioxide, and residual carbonate species have been shown to improve 
the activity of methanol-synthesis catalysts'’. The calcined material 
was then reduced in situ in dilute hydrogen at 225°C. 

Figure 2 contrasts the activity of our model catalysts with that of com- 
mercial catalysts, showing that the catalyst derived from SAS-prepared 
zincian georgeite is more active than both the catalyst formed from 
zincian malachite and the standard commercial catalyst. In the case 
of methanol synthesis, for which our model catalysts showed markedly 
higher initial productivity (normalized for copper surface area; 

0FA, UK. 3Kathleen Lonsdale Building, Department of Chemistry, University College London, Gordon Street, London WC1H 0AJ, UK. Diamond Light Source, Didcot OX11 0DE, UK. Department 
of Chemistry, University of Liverpool, Crown Street, Liverpool L69 7ZD, UK. Center for Electron Nanoscopy, Technical University of Denmark, Fysikvej 307, DK-2800 Kgs Lyngby, Denmark. 

Figure 1 | Thermal gravimetric analysis (TGA) with evolved gas analysis 
(EGA) of georgeite and malachite. a, TGA of georgeite (black solid 
line), zincian georgeite (red solid line), malachite (black dotted line) and 
zincian malachite (red dotted line). b, EGA of georgeite (black lines) and 

Extended Data Fig. 4), the enhancement of activity was short-lived. 
But in the LTS reaction, the SAS-prepared zincian-georgeite-derived 
catalyst was far more active and stable over the entire test period; 
indeed, enhanced activity was apparent at all contact times investigated 
(Fig. 2). These observations demonstrate that the availability of stable 
zincian georgeite as a precursor gives access to highly active copper/ 
zinc-oxide catalysts, with the LTS activity and stability maintained 

[Figure 1 axis label: Temperature (°C)]
zincian georgeite (red lines). c, EGA of malachite (black lines) and zincian 
malachite (red lines). In b and c, the solid lines indicate CO2 (fragment 
with mass 44) and dashed lines indicate H2O (fragment with mass 18). 


using 30% less copper than is present in a catalyst derived from a 2/1 
copper/zinc georgeite (Extended Data Fig. 4). 

The stability and quantities of georgeite that are available through 
SAS allowed us to explore structural origins for its enhanced activity. 
X-ray-absorption near-edge structure (XANES) data for copper-only 
georgeite and malachite precursors (Fig. 3a) indicate that both materials 
contain Cu²⁺ in a distorted octahedral environment”, in agreement 

Figure 2 | Comparison of our model catalysts to commercial catalysts. 
a, Methanol synthesis (shown as mass-normalized, time-on-line 
methanol production). The dashed line denotes the representative 
reactor-bed temperature for methanol synthesis. Reaction conditions 
for methanol synthesis: 190-250 °C (shaded area shows reaction under 
equilibrium conditions at 250 °C), 25 bar, gas composition CO/CO2/H2/ 
N2 = 6/9.2/67/17.8, mass hourly space velocity (MHSV) = 7,200 l kg⁻¹ h⁻¹. 
The selectivities of all of the catalysts to methanol are greater than 
99.96%. b, Low-temperature water-gas shift (LTS) reaction (shown 

as time-on-line CO conversion). The dashed line represents the 
maximum equilibrium conversion. Reaction conditions for LTS: 220 °C, 
27.5 bar, gas composition H2O/CO/CO2/H2/N2 = 50/2/8/27.5/12.5, 
MHSV = 75,000 l kg⁻¹ h⁻¹. c, LTS conversions at different gas hourly 
space velocities. (Extended Data Fig. 4 shows LTS mass-normalized data, 
copper mass-normalized productivities/activities, initial copper surface 
area normalized productivities/activities and product impurities.) Red 
triangles, zincian-georgeite-derived catalyst; blue squares, co-precipitated 
zincian-malachite-derived catalyst; open squares, industrial standards 
(composition of the industrial methanol-synthesis catalyst by wt% is CuO/ 
ZnO/Al2O3 = 60/30/10; composition of the industrial LTS catalyst by wt% 
is CuO/ZnO/Al2O3 = 50/33/17). Error-bar calculations are discussed in 
Methods. 

Figure 3 | X-ray absorption fine-edge spectroscopy (XAFS) and X-ray 
pair distribution function (PDF) analysis. a, Copper K-edge XAFS 
information on local structure, for malachite (red line) and georgeite 
(black line). b, Fourier transform (FT) of extended X-ray absorption 
fine-edge spectra (EXAFS) k²-weighted χ data (Fourier transform k² 
χ(R)), with associated fitting parameters for malachite (red line) and 
georgeite (black line). Variation in the magnitude of the FT is plotted 
with distance R (Å) from the Cu absorber; k denotes the fitting window 
from the χ data. The fit (shown with dashed lines) was modelled on four 
shorter Cu-O bond distances and two longer Jahn-Teller distorted Cu-O 
bond distances (Extended Data Table 1). c, Ultraviolet-visible spectra 
for malachite (red line) and georgeite (black line), with centres of d-d 


with diffuse reflectance ultraviolet-visible spectroscopy (DRS) and 
extended X-ray absorption fine structure (EXAFS) data (Fig. 3b and 
Extended Data Table 1). But our analysis also suggests subtle differences 
in coordination geometries. Specifically, DRS locates the optical band 
associated with the metal centre(s) in georgeite at 770 nm, whereas the 
combined bands from the two crystallographic copper centres in mal- 
achite occur at 825 nm (ref. 21). Although XANES data for georgeite 
and malachite are similar, the Fourier transform of the EXAFS χ data 
(Fig. 3b) shows that malachite has contributions further out from the 
copper absorber that are correlated to the scattering effects of copper- 
copper neighbours, whereas there are very few features in the georgeite 
EXAFS data beyond 2 Å and no evidence of metal-metal correlation. 
The addition of zinc did not affect metal geometry in georgeite, with no 
discernible changes appearing in the copper EXAFS data or zinc K-edge 
XANES or EXAFS χ-plots of zincian georgeite (such changes would 
be expected in the case of impurity phases such as ZnCO3) (Extended 
Data Fig. 5). 

X-ray pair distribution function (PDF) analysis, which does not suf- 
fer from phenomena related to scattering-path phasing, indicated that 
georgeite and zincian georgeite have strikingly similar and strong short- 
range order up to 5 Å (Fig. 3d); subtle differences appear between 5 Å 
and 10 Å, with georgeite exhibiting less long-range order. The extended 
order seen upon addition of zinc could be responsible for the enhanced 
stability of zincian georgeite prepared by co-precipitation’®. A comparison 
of the georgeite PDF with a calculated PDF of malachite” 

transition bands (based on peak maxima) marked as small vertical lines 
(zincian phases are shown in Extended Data Fig. 5a). d, X-ray PDF data for 
georgeite and zincian georgeite. D(r) is the probability of finding a pair of 
atoms, weighted by the scattering power of the pair, at a given distance, r. 
The probability is scaled to r to emphasize order at greater distances. e, 
Top, observed PDF G(r) data (G(r) is equivalent to D(r), only without 
scaling by r) for georgeite. Bottom, calculated PDF G(r) data for all of the 
atom pairs of malachite (black), together with the partial contributions 
from the most strongly contributing atom pairs: C—O (red), Cu-O 
(green) and Cu-Cu (blue). The dotted line represents a contribution from 
crystallographically equivalent Cu atoms in malachite (see text). 


(Fig. 3e) shows that georgeite is not simply a nanoscale form of mala- 
chite (a phenomenon noted for other amorphous minerals, such as iron 
sulphide?>): the complete absence of the 15.27 Å correlation between 
crystallographically identical copper atoms in georgeite shows that 
local ordering operates well below the length scales associated with 
the malachite unit cell. PDF matching to other copper phases such 
as aurichalcite~*, azurite” and rosasite*® confirmed georgeite to be a 
material distinct from any known copper hydroxycarbonate (Extended 
Data Fig. 5). 

After synthesis, the zincian georgeite and malachite precursors were 
calcined at 300°C to form copper-oxide/zinc-oxide intermediates, and 
then reduced to the active catalyst phase. We find that the intermediate 
materials produced on calcination and the final-state catalysts both 
exhibit a microstructure that reflects the structural characteristics of 
their precursor phase. In the case of SAS-prepared zincian georgeite, 
calcination produced disordered copper oxide/zinc oxide with nanocrystals 
of 3-4 nm in diameter, as observed by STEM and PDF analysis 
(Extended Data Figs 6 and 7). The extent of the disorder resulted in 
XRD and XANES analysis showing no discernible metal-oxide con- 
tribution, although subtle changes in the EXAFS data were observed 
(Extended Data Fig. 8). Disordered copper-oxide species have previ- 
ously been reported to have copper-edge XANES data that fit with a 
Jahn-Teller distorted octahedron”’, as with the calcined zincian georgeite. 
We found that the nanocrystallinity of the SAS-prepared zincian 
georgeite persisted until the residual high-temperature carbonate 

Figure 4 | Characterization, by environmental transmission electron 
microscopy, of the microstructure of the reduced final-state catalysts 
derived from zincian georgeite and zincian malachite. Calcined 
materials were reduced under a H2 pressure of 2 mbar and a temperature 
of 225°C. The representative images shown in the top panels reveal 
distinct Cu nanoparticles distributed on ZnO for both samples. This 
was confirmed by fast Fourier transform (FFT) analysis, shown in the 
bottom panels. The reduced malachite sample generally has larger Cu 
nanoparticles than the georgeite sample. Also, there is a substantially 


decomposed to form XRD-discernible copper-oxide and zinc-oxide 
phases above 425 °C (Extended Data Fig. 7). Conversely, calcination of 
zincian malachite at temperatures greater than 300°C produced crys- 
talline, XRD- and XANES-observable, copper-oxide and zinc-oxide 
phases (Extended Data Fig. 7). A linear combination fit of the XANES 
data showed that 60% of the 300°C calcined zincian malachite 
(Extended Data Fig. 8) was associated with copper oxide and zinc oxide, 
whereas these phases were not present in the calcined zincian georgeite. 
STEM showed that 300°C calcined zincian malachite (Extended Data 
Fig. 6) was composed of copper-oxide and zinc-oxide grains with much 
longer-range crystalline order than the corresponding calcined zincian 
georgeite. 

The nanocrystalline nature of calcined zincian georgeite translates 
to a high copper surface area (53 m*g~!) and small mean copper crys- 
tallite size (7.40.7 nm) when the material is reduced. Environmental 
transmission electron microscopy (ETEM) carried out under reducing 
conditions shows a complex mixture of copper particles and excep- 
tionally small, disordered zinc-oxide grains (the mean zinc-oxide 
crystallite size was 2.8 + 1.6nm, as judged by XRD), with notable 
interactions between copper and zinc oxide (Fig. 4). By comparison, 
the more crystalline zincian-malachite-derived copper oxide/zinc 
oxide had a lower copper surface area (35mg!) anda microstructure 
composed of distinctly larger copper particles (mean crystallite size 
was 14.2 +3.8nm, by XRD) and zinc oxide particles (mean crystallite 
size was 5.6 +3.6nm by XRD). This difference in particle size and 
structural order is reflected in the smaller copper-copper and zinc- 
zinc coordination numbers noted for the zincian-georgeite-derived 
catalyst from in situ EXAFS analysis (Extended Data Fig. 9). Ona 
macroscopic level, strong interactions between copper and zinc oxide, 
as shown by XPS observations of partial ZnO, coverage of copper on 
reduction’, were observed for both catalysts (Extended Data Fig. 9). 
However, ETEM shows considerably more interactions between 
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greater Cu-ZnO interaction in the georgeite-derived sample than in the 
malachite-derived catalyst. The georgeite sample has a complex mix of 

Cu and poorly defined ZnO particles (as revealed by diffuse rings in the 
large-scale FFTs); the malachite sample shows better-defined, phase- 
separated Cu and ZnO. The oxidation state of Cu was monitored by means 
of electron energy-loss spectroscopy, which revealed that Cu was in its 
metallic state in both samples under the reduction conditions (Extended 
Data Fig. 9f). The Cu zone axis is the direction (as determined by analysis 
of the FFT) along which the Cu crystallite is being observed. 


copper and zinc oxide in the catalyst that was derived from disordered 
zincian georgeite. 

Methanol-synthesis activity is well known to correlate strongly with copper
surface area, and a large interfacial area between copper and zinc oxide has
been reported to produce highly active methanol-synthesis catalysts,
containing active sites that are associated with surface-defect copper sites
decorated by zinc. However, we find that 40 hours of LTS conditions markedly
decrease the copper surface areas of catalysts derived from both zincian
georgeite (53 ± 3 m² g⁻¹ to 17 ± 1 m² g⁻¹) and zincian malachite
(38 ± 2 m² g⁻¹ to 19 ± 1 m² g⁻¹); yet the catalysts were still active after
this time period, clearly indicating that copper surface areas do not
adequately correlate with catalyst activity in this reaction. We surmise
that the high activity of our catalyst
can be explained by a combination of a greater content of exposed 
copper, and an intimate interface contact between copper and zinc 
oxide, inherited from the zincian-georgeite precursor. Moreover, the 
SAS method produces very pure zincian georgeite with a low content 
of sodium ions, without requiring an aqueous washing step; this high 
purity might also contribute to both the high activity and the stability 
of the zincian-georgeite-derived LTS catalyst. 
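
The size of the surface-area loss quoted above can be made explicit with a short calculation. The following is a minimal sketch in Python (not part of the original analysis); the function name is illustrative, and only the mean surface areas stated in the text are used, ignoring the quoted uncertainties.

```python
# Mean copper surface areas (m^2 per g) before and after 40 hours under LTS
# conditions, as quoted in the text. Uncertainties are ignored in this sketch.
surface_areas = {
    "zincian georgeite": (53.0, 17.0),
    "zincian malachite": (38.0, 19.0),
}


def fractional_loss(before, after):
    """Return the fraction of the initial surface area that was lost."""
    return (before - after) / before


for precursor, (before, after) in surface_areas.items():
    loss_pct = 100.0 * fractional_loss(before, after)
    print(f"{precursor}: {loss_pct:.0f}% of copper surface area lost")
```

Under these numbers the georgeite-derived catalyst loses roughly two-thirds of its copper surface area and the malachite-derived catalyst about half, which is the basis for the statement that surface area alone does not track activity in this reaction.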


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Material preparation. SAS method for preparing georgeite and zincian georgeite. 
Copper(II) acetate monohydrate (4 mg ml⁻¹) and zinc(II) acetate dihydrate
(2.16 mg ml⁻¹) (Sigma Aldrich > 99% Puriss) were dissolved in ethanol (reagent
grade, Fisher Scientific) containing 0 vol%, 5 vol% or 10 vol% deionized water.
Smithsonite (ZnCO3) was prepared with zinc(II) acetate dihydrate (2.16 mg ml⁻¹)
in a 10 vol% water and ethanol solution. SAS experiments were performed using
apparatus manufactured by Separex. CO2 (from BOC) was pumped through the
system (held at 110 bar, 40 °C) via the outer part of a coaxial nozzle at a rate of
6 kg h⁻¹. The metal salt solution was concurrently pumped through the inner
nozzle, using an Agilent HPLC pump at a rate of 6.4 ml min⁻¹. The resulting
precipitate was recovered on a stainless steel frit, while the CO2-solvent mixture
passed downstream, where the pressure was decreased to separate the solvent and
CO2. The precipitation vessel has an internal volume of 1 litre. Precipitation was
carried out for 120 minutes, and followed by a purge of the system with
ethanol-CO2 for 15 minutes, then CO2 for 60 minutes under 110 bar and 40 °C. The
system was then depressurized and the dry powder collected. Recovered samples
were placed in a vacuum oven at 40 °C for 4 hours to remove any residual solvent.
Approximately 1.5 g of georgeite is prepared during the 120-minute duration of
solution precipitation.
Co-precipitation method for preparing malachite and zincian malachite precursors 
for the standard methanol-synthesis catalysts. The procedure was performed via a 
semi-continuous process, using two peristaltic pumps to maintain pH. Copper(II)
nitrate hydrate and zinc(II) nitrate hydrate solutions in deionized water were
prepared with copper/zinc molar ratios of 1/0, 1/1 and 2/1. The premixed metal
solution (5 l, ~0.5 g ml⁻¹) was preheated along with a separate 5 l solution of 1.5 M
sodium carbonate. The mixed metals were precipitated by combining the two 
solutions concurrently at 65°C, with the pH being held between 6.5 and 6.8. The 
precipitate would spill over from the small precipitation pot into a stirred ageing 
vessel, held at 65°C. The precipitate was aged for 15 minutes after all precursor 
solutions had been used. 

The precipitate was then filtered and washed to minimize sodium content. The
sample was washed with 6 l of hot deionized water, and the sodium content
monitored using a photometer. The washing process was repeated until the sodium
content showed no change. The sample was then dried at 110 °C for 16 hours.
Samples were calcined for 6 hours at 300 °C or 450 °C in a tube furnace under static
air. The ramp rate used to reach the desired temperature was 1 °C min⁻¹.
Catalyst testing. Catalyst testing was performed with 0.5 g of the calcined catalyst, 
pelleted and ground to a sieve fraction of 0.6-1 mm for both the
methanol-synthesis and the LTS test reactions. The catalysts were reduced in situ
using a 2% H2/N2 gas mixture at 225 °C (ramp rate 1 °C min⁻¹), before the reaction gases were
introduced. Data reported for the industrial standards are the mean value from 
four repeat experiments. Error bars are based on the standard deviation from the 
mean values. 

Methanol synthesis. Testing was carried out in a single-stream, six-fixed-bed reactor 
with an additional bypass line. After reduction, the catalysts were then subjected 
to synthesized syngas (CO/CO2/H2/N2 = 6/9.2/67/17.8) at 3.5 l h⁻¹, 25 bar
pressure and 195 °C. In-line gas analysis was performed using an FT-IR spectrometer,
which detected CO, CO2, H2O and CH3OH. Downstream of the catalyst beds,
knockout pots collected effluent produced from the reaction. The contents were 
collected after each test run and analysed using gas chromatography to evaluate the 
selectivity of catalysts. The total system flow was maintained using the bypass line. 
LTS. Testing was performed in six parallel fixed-bed reactors with a single stream 
feed and an additional bypass line. After reduction, the catalysts were subjected 
to synthetic syngas (CO/CO2/H2/N2 = 1/4/13.75/6.25) at 27.5 bar pressure and
220 °C. The reactant gas stream was passed through vaporized water to give a
water composition of 50 vol%. This gives a total gas flow of H2O/CO/CO2/H2/
N2 = 50/2/8/27.5/12.5. The standard mass hourly space velocity (MHSV) used
for testing was 75,000 l h⁻¹ kg⁻¹. In-line IR analysis was carried out to measure
CO conversion. Selectivity was determined by the methanol content within the 
knockout pots downstream of the reactors. Space and mass velocity variation tests 
were performed at 65 hours’ time-on-line by altering the flow for each catalyst bed. 
Relative activities were calculated by altering the flow for each catalyst bed in order 
to achieve 75% CO conversion after 75 hours’ time-on-line. The total system flow 
was maintained using the bypass line. 
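
The relationship between the dry syngas ratios, the 50 vol% water content and the quoted wet feed composition can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the use of a 0.5 g charge per bed to convert the MHSV into a volumetric flow is an assumption based on the catalyst mass stated under "Catalyst testing".

```python
def wet_composition(dry_ratios, water_vol_pct):
    """Rescale dry-gas ratios so that water plus dry gases sum to 100 vol%."""
    dry_total = sum(dry_ratios.values())
    scale = (100.0 - water_vol_pct) / dry_total
    wet = {"H2O": water_vol_pct}
    for gas, ratio in dry_ratios.items():
        wet[gas] = ratio * scale
    return wet


def volumetric_flow_l_per_h(mhsv_l_per_h_per_kg, catalyst_mass_g):
    """Convert a mass hourly space velocity into a total gas flow for one bed."""
    return mhsv_l_per_h_per_kg * catalyst_mass_g / 1000.0


# Dry syngas ratios and water content quoted in the text.
dry_syngas = {"CO": 1.0, "CO2": 4.0, "H2": 13.75, "N2": 6.25}
feed = wet_composition(dry_syngas, water_vol_pct=50.0)
print(feed)

# Assumption: each bed holds the 0.5 g charge quoted under "Catalyst testing".
print(volumetric_flow_l_per_h(75_000.0, 0.5))
```

Because the dry ratios sum to 25, adding 50 vol% water simply doubles each dry ratio, which reproduces the H2O/CO/CO2/H2/N2 composition given in the text.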

Characterization methods. Powder X-ray diffraction (XRD). Measurements were 
performed using a PANalytical X’pert Pro diffractometer with a Ni-filtered Cu Kα
radiation source operating at 40 kV and 40 mA. Patterns were recorded over the
range of 10-80° 2θ using a step size of 0.016°. All patterns were matched using the
International Centre for Diffraction Data (ICDD) database. An in situ Anton Paar
XRK900 cell (with an internal volume of ~0.5 l) was used to monitor the formation
of metallic copper during the reduction of the CuO/ZnO materials. A flow of 2%
H2/N2 (60 ml min⁻¹) was passed through the sample bed while the cell was heated
to 225 °C (ambient temperature to 125 °C, ramp rate 10 °C min⁻¹; 125-225 °C, ramp
rate 2 °C min⁻¹). The sample was then cooled to room temperature and a 20-100°
2θ scan performed. Copper crystallite size was estimated from a peak-broadening
analysis of the XRD pattern using Topas Academic, and the volume-weighted
column height (Lvol) was calculated according to ref. 31.
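
For readers unfamiliar with peak-broadening analysis, the idea can be illustrated with the Scherrer equation. This is a deliberately simplified stand-in, not the Topas Academic whole-pattern analysis or the volume-weighted column height used here. The function name, the shape factor of 0.9 and the example peak width are assumptions chosen for illustration; the width is assumed to be already corrected for instrumental broadening.

```python
import math

CU_K_ALPHA_ANGSTROM = 1.5406  # Cu K-alpha1 wavelength in angstroms


def scherrer_size_nm(fwhm_deg, two_theta_deg, shape_factor=0.9,
                     wavelength_angstrom=CU_K_ALPHA_ANGSTROM):
    """Estimate crystallite size (nm) from the width of a single reflection.

    fwhm_deg is the full width at half maximum in degrees 2-theta, assumed to
    be corrected for instrumental broadening. shape_factor is the Scherrer
    constant (0.9 is a common choice, used here as an assumption).
    """
    beta = math.radians(fwhm_deg)                 # peak width in radians
    theta = math.radians(two_theta_deg / 2.0)     # Bragg angle in radians
    size_angstrom = shape_factor * wavelength_angstrom / (beta * math.cos(theta))
    return size_angstrom / 10.0


# Hypothetical example: a reflection near 43.3 degrees 2-theta (the region of
# the Cu(111) peak) with an illustrative width of 1.2 degrees.
print(f"{scherrer_size_nm(1.2, 43.3):.1f} nm")
```

Broader peaks give smaller sizes, which is the qualitative reason why the nanocrystalline georgeite-derived material shows weaker, more diffuse reflections than the malachite-derived material.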

X-ray absorption fine-edge spectroscopy (XAFS). K-edge XAFS studies of copper 
and zinc were carried out on the B18 beamline at the Diamond Light Source, 
Didcot, UK. Measurements were performed using a quick extended (QE) XAFS 
set-up with a fast-scanning silicon (111) double-crystal monochromator. The time 
resolution of the spectra reported herein was 1 minute per spectrum (kmax= 14), 
and on average three scans were acquired to improve the signal-to-noise ratio of 
the data. All ex situ samples were diluted with cellulose and pressed into pellets to 
optimize the effective-edge step of the XAFS data and measured in transmission 
mode using ion-chamber detectors. All transmission XAFS spectra were acquired 
concurrently with the appropriate reference foil (copper or zinc) placed between 
It and Iref. XAS data processing and EXAFS analysis were performed using
IFEFFIT with the Horae package (Athena and Artemis). The FT data were
phase-corrected for oxygen. The amplitude-reduction factor, S0², was derived from
EXAFS data analysis of a known copper reference compound, namely tenorite 
(CuO). For the fitting of the local coordination geometry of georgeite and mala- 
chite, the Jahn-Teller distorted copper-oxygen bond was difficult to observe 
because of thermal disorder, and so was fixed in the model. 

X-ray pair distribution function (PDF) analysis. Synchrotron X-ray PDF data were 
collected on the 11-ID-B beamline at the Advanced Photon Source, Argonne 
National Laboratory, and on the I15 beamline at the Diamond Light Source.
Powder samples were packed into Kapton capillaries with an internal diameter
of 1 mm. Room-temperature powder X-ray diffraction data were collected at a
wavelength of 0.2114 Å (11-ID-B) and 0.1620 Å (I15) using the Rapid Acquisition
PDF method. The scattering data (0.5 < Q < 22 Å⁻¹) were processed into PDF
data using the program GudrunX. Two notations of PDF data are presented in
this manuscript: G(r) and D(r) (ref. 36). The total radial distribution function, G(r),
is the probability of finding a pair of atoms, weighted by the scattering power of the
pair, at a given distance, r; it is the Fourier transform of the experimentally
determined total structure factor. D(r) is re-weighted to emphasize features at
high r, such that D(r) = 4πrρ0G(r), where ρ0 is the average number density of the
material (in atoms Å⁻³).
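
The re-weighting from G(r) to D(r) is a pointwise multiplication, as the following minimal sketch shows. The function name and the example values are hypothetical and are not taken from the measured data; the number density is assumed to be supplied in atoms per cubic angstrom.

```python
import math


def g_to_d(r_values, g_values, number_density):
    """Convert G(r) to D(r) = 4 * pi * r * rho0 * G(r), point by point.

    r_values are distances in angstroms and number_density is rho0 in atoms
    per cubic angstrom (assumed units for this sketch).
    """
    return [4.0 * math.pi * r * number_density * g
            for r, g in zip(r_values, g_values)]


# Illustrative values only: a constant G(r) becomes linear in r after
# re-weighting, which is why D(r) emphasizes features at high r.
d_values = g_to_d([1.0, 2.0, 4.0], [0.5, 0.5, 0.5], number_density=0.1)
print(d_values)
```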

Fourier transform-infrared spectroscopy (FT-IR). Analysis was carried out using 
a Jasco FT/IR 660 Plus spectrometer in transmission mode over a range of 
400-4,000 cm⁻¹. Catalysts were supported in a pressed KBr disk.

Raman spectroscopy. Raman spectra were obtained using a Renishaw inVia
spectrometer equipped with an argon ion laser (λ = 514 nm).

Thermal gravimetric analysis (TGA). TGA measurements were performed using 
a SETARAM Labsys analyser with sample masses of about 20 mg, at 1 °C min⁻¹
under air with a flow rate of 20 ml min⁻¹.

Evolved gas analysis (EGA). EGA experiments were performed using a Hiden 
CATLAB under the same conditions used in the TGA experiments. 

Copper surface area analysis. This was determined by reactive frontal chromatog- 
raphy. Catalysts were crushed and sieved to a particle size of 0.6-1 mm and packed 
into a reactor tube. The catalyst was purged under helium for 2 minutes at 70°C 
before being heated under a 5% H2 reduction gas to 230 °C (8 °C min⁻¹) for 3 hours.
The catalyst was then cooled to 68 °C under helium before the dilute 2.5% N2O
reaction gas was added with a flow rate of 65 ml min⁻¹. The formation of N2 from
the surface oxidation of the copper catalyst by N2O was measured downstream
using a thermal conductivity detector (TCD). Once the surface of the copper is
fully oxidized, there is a breakthrough of N2O, detected on the TCD. From this,
the number of oxygen atoms that are chemisorbed on the copper surface can be
determined. The number of exposed surface copper atoms and the copper surface
area can then be derived. Quoted surface areas are calculated using discharged
sample mass. Note that recent work has shown that, if the catalyst is exposed
to partial pressures of H2 exceeding 0.05 bar, then partial reduction of ZnO at the
copper interface can occur. This will affect copper surface area results, owing to
N2O oxidizing both copper and ZnOx. In these cases, alternative techniques, such
as a H2 thermal conductivity detector, will give more accurate data with respect
to copper surface area.

Copper surface area analysis after exposure to LTS conditions. Copper surface area 
analysis of the fresh catalysts was carried out on a Quantachrome ChemBET 3000.
Sample (100 mg) was packed into a stainless-steel U-tube and purged with
high-purity helium for 5 minutes. Samples were reduced using 10% H2/Ar (30 ml min⁻¹)
heated to 140 °C at 10 °C min⁻¹, before heating further to 225 °C at 1 °C min⁻¹.
The resulting catalyst was held at this temperature for 20 minutes to ensure that
complete reduction took place. Note that, under this partial pressure of H2, it has
been reported that ZnO species in contact with copper can partially reduce.
Residual H2 was flushed from the system by switching the gas line back over to
helium (80 ml min⁻¹), while holding the sample at 220 °C for another 10 minutes.
The temperature was then reduced to 65 °C for N2O pulsing (BOC AA Grade).
A programme of 12 pulses of 113 µl N2O with a 5-minute stabilization time between
pulses was carried out, followed by three pulses of N2 for calibration. Unreacted
N2O was trapped before reaching the detector using a molecular sieve 5A (pelleted,
1.6 mm, Sigma Aldrich) trap. The copper surface area was determined from the
amount of N2 emitted and the post-reaction analysis of catalyst mass, as follows:

Copper surface area (m² g⁻¹) = [N2 volume (ml) × NA × 2] / [Catalyst mass
(g) × 24,000 (ml) × 1.47 × 10¹⁹ (atoms m⁻²)]

where NA is the Avogadro constant, 6.022 × 10²³ (atoms). The key assumptions are
that the amount of N2 emitted amounts to half a monolayer’s coverage of oxygen,
and that the surface density of copper is 1.47 × 10¹⁹ (atoms m⁻²). The volume of
N2 produced was quantified using a TCD.
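
The surface-area expression above can be written as a short function. This is a minimal sketch under the stated assumptions (half-monolayer oxygen coverage, a molar gas volume of 24,000 ml and a copper surface density of 1.47 × 10¹⁹ atoms m⁻²); the function name and the example N2 volume and catalyst mass are hypothetical and are not measured values from this work.

```python
AVOGADRO = 6.022e23           # atoms per mole, as quoted in the text
MOLAR_VOLUME_ML = 24_000.0    # ml of gas per mole, as used in the text
CU_SURFACE_DENSITY = 1.47e19  # copper atoms per m^2, as assumed in the text
CU_ATOMS_PER_N2 = 2           # half-monolayer O coverage: 2 Cu per N2 released


def copper_surface_area(n2_volume_ml, catalyst_mass_g):
    """Return the copper surface area in m^2 per gram of catalyst."""
    moles_n2 = n2_volume_ml / MOLAR_VOLUME_ML
    surface_cu_atoms = moles_n2 * AVOGADRO * CU_ATOMS_PER_N2
    return surface_cu_atoms / (catalyst_mass_g * CU_SURFACE_DENSITY)


# Hypothetical inputs for illustration: 0.5 ml of N2 evolved from 0.1 g of
# catalyst (post-reaction mass).
area = copper_surface_area(n2_volume_ml=0.5, catalyst_mass_g=0.1)
print(f"{area:.1f} m2 per g")
```

The result scales linearly with the N2 volume and inversely with the catalyst mass, so the post-reaction (discharged) mass must be used consistently when comparing samples.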

After copper surface area analysis, the samples were kept under N2 and
transferred to an LTS reactor. Ageing was carried out in a single fixed-bed reactor
equipped with a bypass line. CO, N2, CO2 and H2 were introduced to the catalyst
bed via mass-flow controllers (Bronkhorst). Water of high-performance liquid
chromatography (HPLC) grade was passed through a liquid-flow controller
(Bronkhorst) and then into a controlled evaporator mixer (Bronkhorst) that was
heated to 140 °C. N2 was fed through the vaporized water to give a dilute syngas
mixture (H2O/CO/CO2/H2/N2 = 25/1/4/13.75/56.25). This mixture was
introduced at 220 °C after re-reduction of the catalyst. The gas flows were controlled
to achieve an MHSV of 30,000 l h⁻¹ kg⁻¹. After ageing for 40 hours, the samples
were transferred back to the Quantachrome ChemBET Chemisorption analyser, 
whereby, after re-reduction, the copper surface areas of the aged catalysts were 
measured, as described above. 

Scanning transmission electron microscopy (STEM). Samples for STEM character- 
ization were dry-dispersed onto holey carbon TEM grids. They were examined 
in an aberration-corrected JEOL ARM-200CF scanning transmission electron 
microscope operating at 200kV in bright-field and dark-field STEM imaging 
modes. Reliable electron microscopy results could be obtained only from the set 
of zincian-georgeite- and zincian-malachite-derived materials, as these were found 
to be stable under the vacuum environment of the microscope, and largely unaf- 
fected by electron-beam irradiation. By way of contrast, the copper-only georgeite 
precursor materials were highly unstable under vacuum conditions (even without 
electron-beam irradiation), turning from a blue to dark-green colour, probably 
owing to the loss of occluded water. The corresponding zincian-georgeite materials 
showed no such colour transformation under vacuum. 

Environmental transmission electron microscopy (ETEM). Samples for reduc- 
tion during characterization by ETEM were dry-dispersed on heater chips 
(DENSsolutions through hole) and then mounted on a DENSsolutions SH30
heating holder. The holder with sample was inserted into an FEI Titan 80-300
environmental transmission electron microscope operated at 300 kV (ref. 39). The
reduction of the samples was performed in situ as follows, for both the georgeite
and the malachite precursors: a flow of H2 was let into the microscope, building
up a pressure of 2 mbar. The sample was heated from room temperature to 150 °C
using a heating ramp rate of 10 °C min⁻¹. The final heating to 225 °C was done at
1 °C min⁻¹. The oxidation state of copper was monitored by electron energy-loss
spectroscopy (EELS) during the reduction treatment using a Gatan Tridiem 866
spectrometer attached to the microscope. After an extended in situ treatment of
several hours at 225 °C, phase-contrast lattice imaging was performed at elevated
temperature in an H2 atmosphere for both the georgeite and malachite samples,
and recorded using a Gatan US 1000 charge-coupled-device camera.

X-ray photoelectron spectroscopy (XPS). A Kratos Axis Ultra DLD system was used 
to collect XPS spectra, using a monochromatic Al Kα X-ray source operating at
120 W. Data were collected with pass energies of 160 eV for survey spectra, and 
40 eV for the high-resolution scans. The system was operated in the Hybrid mode, 
using a combination of magnetic immersion and electrostatic lenses and acquired 
over an area approximately 300 × 700 µm². A magnetically confined charge-
compensation system was used to minimize charging of the sample surface, and the 
resulting spectra were calibrated to the C(1s) line at 284.8 eV; all spectra were taken 
with a 90° take-off angle. A base pressure of ~1 x 10-° Torr was maintained during 
collection of the spectra. Gas treatments were performed in a Kratos catalysis cell, 
which mimics the conditions of a normal reactor vessel, allowing the re-creation 
of reactor conditions and analysis of the chemical changes taking place on the 
catalyst surface. Briefly, the cell consists of a fused quartz reactor vessel contained 
within a stainless-steel vacuum chamber (base pressure ~10-* Torr after baking). 
Samples were heated at a controlled ramp rate of 2 °C min⁻¹ to a temperature of
225 °C using a Eurotherm controller. The catalysts were exposed to an atmosphere
of 2% H2 in nitrogen with a flow rate of 30 ml min⁻¹, controlled using MKS mass
flow controllers during the heating ramp, during the 20-minute isotherm at 225°C, 
and also while the catalyst was cooled to 25°C. The samples were analysed before 
and after gas treatment without breaking the vacuum. 

Inductively coupled plasma mass spectrometry (ICP-MS) and carbon-hydrogen- 
nitrogen (CHN) analysis. These analyses were provided as a commercial service 
by Warwick Analytical Services. 

Helium pycnometry. This analysis was provided as a commercial service by MCA 
Services. 
Extended Data Figure 1 | FT-IR and XRD characterization of SAS copper and
copper/zinc acetate precipitates. Key: i, SAS-prepared copper acetate; ii,
SAS-prepared georgeite; iii, malachite prepared by co-precipitation; iv, 2/1
copper/zinc malachite prepared by co-precipitation; v, SAS-prepared 2/1
copper/zinc georgeite; vi, SAS-prepared 1/1 copper/zinc georgeite; vii,
SAS-prepared smithsonite (ZnCO3). a, XRD analysis of copper-only samples.
b, FT-IR analysis of copper-only samples. c, FT-IR band assignment of
copper-only samples, with (reference) designated as received copper(II) acetate
monohydrate. Values given are for IR band positions in wavenumbers (cm⁻¹).
*The presence of this band shows that SAS precipitation with no additional
water co-solvent produced some georgeite as well as amorphous copper acetate.
We attribute the formation of georgeite to the small amount of water present
from the monohydrated starting salt. d, IR spectrum of mineralogical georgeite,
reproduced with the permission of the Mineralogical Society of Great Britain
and Ireland from ref. 11. e, FT-IR spectra of copper/zinc samples. f, XRD of
copper/zinc samples.
a
Elemental composition Densit Na 
enst'y content 
Sample 
ou a e H (wt.%) Calculated formula gcm 4 
(wt.%) (wt.%) (wt.%) . HGna9 
SAS copper (II) acetate 46.6 - 12.9 2.3 - 2.50 - 


Cur.s(COs)s(OH), 1-4.6H20 
SAS georgeite 50.4 - 6.8 1.5 3.06 - 
Cur(COs)s(OH),.5H,0 


Mineralogical georgeite 47.3 - 5.3 2.4 Cus(CO3)3(OH)4,.6H2O0 2.6* - 


Georgeite synthesised 57.8 3 55 0.9 Cu,CO,(OH), 2.4-2.8* : 
from sulphates 


(Cup 662No.34)s.2(CO3)3(OH)4.3.3.4H20 3.03 
34.3 17.5 5.6 1.2 66 
(Cuo.66Z2No.34)s(CO3)3(OH)4.3H20 


2:1 Cu:Zn SAS zincian 
georgeite 


Cur ;(CO3)5(OH),.3.3.5H20 
24.1 27.8 6.7 1.3 - - 
Cur, (CO3)s(OH)4.4H20 


1:1 Cu:ZnSAS zincian 
georgeite 


(Cu), 9 (CO3)(OH);.90.3H20 
Co-precipitated malachite 55.4 - 5.4 1.0 3.91 - 
(Cu)2(CO3)(OH)2 


os (Cu; 5ZNo7)(COs)(OH)2 2 
Co-precipitated (2:1 Cu:Zn 38.4 178 50 09 3.85 498 


malachite 
) (Cup 66ZNo.34 )2(CO3)(OH)2 
b c 
Calculated Experimentally Calculated/ 100 
Sample massloss* observed mass Experimental 95 
(%) loss (%) 
SAS s 
georgeite a6 a ms S 85 
n 
SAS 2 80 
zincian 32 35 91 E 5 
georgeite 
70 
malachite 30 31 97 
65 
Zincian 28 30 93 60 
malachite 
0 100 200 300 400 500 600 
Temperature (°C) 
Extended Data Figure 2 | Elemental composition of copper and copper/zinc
samples, with supplementary TGA analysis. a, Elemental composition of
SAS-prepared georgeite and co-precipitated malachite samples. Elemental
composition was determined by CHN analysis and ICP-MS. Densities were
determined by helium pycnometry. *Values from ref. 11. †Values from ref. 16.
‡Density determined by sink-float (SF) method. The helium pycnometry used in
our present study provides a skeletal density that negates buoyancy effects.
b, Comparison of experimental and calculated mass losses for georgeite and
malachite from TGA measurements. *Calculated from the assumption of final
products being 2/1 copper oxide/zinc oxide. The lower-than-expected mass losses
for the zincian phases could be associated with the inclusion of copper ions in
the zinc oxide lattice. c, TGA analysis of 2/1 copper/zinc georgeite (red line)
and 1/1 copper/zinc georgeite (black line).
Extended Data Figure 3 | Representative STEM micrographs of the zincian
georgeite precursor. Left, dark-field (DF) STEM micrograph. Right, bright-field
(BF) micrograph. The general morphology of the zincian-georgeite precursor is
shown in the DF-STEM micrograph. It typically consists of very characteristic,
irregularly shaped agglomerates, about 100-200 nm in diameter, that are composed
of ‘amorphous’ non-faceted particles about 40 nm wide. Closer inspection by
BF-STEM shows that these non-faceted particles consist of an amorphous matrix
phase in which are embedded largely disconnected, sub-2-nm crystallites of
ordered material exhibiting clear lattice fringes. The amorphous matrix,
probably containing the carbonate and hydroxyl species, is by far the majority
phase, while the nanocrystallites make up less than 10% of the material by
volume. This observation is consistent with our other characterization data, as
signals from the matrix phase would dominate the XAFS analysis, whereas the
nanocrystallites are too small in dimension to be detected by XRD. Analysis of
fringe spacings and interplanar angles of individual nanocrystallites from such
images suggests that some of the grains could be CuO (tenorite, where the copper
is fourfold coordinated by oxygen), while others fit better to Cu2O (cuprite,
where the copper has a coordination number of 2). No convincing matches of the
lattice fringes to either ZnO or ZnCO3 could be found.
Methanol synthesis productivities normalised LTS activities normalised by copper surface
2:1 Cu:Zn Zincian georgeite derived    112.6    73.9    1.62    n/a
1:1 Cu:Zn Zincian georgeite derived    77.5     57.0    1.72    n/a
2:1 Cu:Zn Zincian malachite derived    80.2     74.1    2.21    n/a
Industrial catalysts                   90.7     89.7    1.76    n/a

Extended Data Figure 4 | Further catalyst testing in methanol-synthesis 
and LTS reactions. Filled red triangles, 2/1 Cu/Zn georgeite; open 
triangles, 1/1 Cu/Zn georgeite; open squares, industrial standard; filled 
blue squares, 2/1 Cu/Zn malachite (this test was terminated after 190°C 
because of poor activity). a, b, Methanol-synthesis data, normalized
to total catalyst mass (a) and copper mass (b), at 190-250 °C. The
dashed line shows the representative reactor-bed temperature. Reaction
conditions were as follows: pressure 25 bar; gas composition CO/CO2/
H2/N2 = 6/9.2/67/17.8; MHSV = 7,200 l kg⁻¹ h⁻¹. c, Concentration of
byproducts that were collected in the condensate pot after the
methanol-synthesis reaction, as determined by gas-chromatographic (GC) analysis.
Byproducts are: ethanol (blue), propanol (red), butanol (green),
iso-butanol (purple), and methyl iso-butyl ketone (turquoise). d, e,
LTS-reaction data, normalized to total catalyst mass (d) and copper mass
(e). Reaction conditions were as follows: temperature 220 °C; pressure
27.5 bar; gas composition H2O/CO/CO2/H2/N2 = 50/2/8/27.5/12.5;
MHSV = 75,000 l kg⁻¹ h⁻¹. f, Concentration of methanol collected in
the condensate pot after the LTS reaction, as determined by GC analysis.
g, Methanol-synthesis productivities and LTS activities, normalized by
copper surface area. *Copper surface area analysis determined by N2O
reactive frontal chromatography before testing. †Methanol-synthesis data
acquired at 190 °C, with steady state being at 18 hours’ time-on-line. ‡LTS
data acquired at 220 °C, with steady state being at 40 hours’ time-on-line.
Note that LTS simulation testing showed that copper surface area dropped
markedly after 40 hours (to 17 m² g⁻¹ and 19 m² g⁻¹ for
zincian-georgeite-derived and zincian-malachite-derived catalysts, respectively).
The inverse correlation between copper surface area and initial 
productivity also suggests that loss of surface area occurs rapidly during 
the reaction and that the initial rate data are therefore likely to be 
inaccurate. 
Extended Data Figure 5 | Spectroscopic analysis of the addition of
zinc to georgeite. a, Diffuse reflectance UV-vis spectra of zincian
malachite (red) and zincian georgeite (black). b, Copper K-edge EXAFS
(χ) comparison of zincian georgeite (with a 4/1 or 2/1 copper/zinc ratio)
with georgeite. c, Zinc K-edge XANES comparison of zincian georgeite
with SAS-prepared smithsonite. d, Zinc K-edge EXAFS (χ) comparison of
zincian georgeite with SAS-prepared smithsonite. e, Comparison of observed
georgeite PDF data with simulated data for crystalline hydroxycarbonate
minerals with similar compositions to that of georgeite, namely aurichalcite,
azurite, rosasite, zincian malachite and malachite.
Extended Data Figure 6 | Representative DF-STEM and BF-STEM
micrographs of zincian georgeite and zincian malachite, calcined at
300 °C. a, DF-STEM of calcined zincian georgeite. Higher-magnification
imaging reveals that, after much of the carbonate and hydroxyl content is
lost by calcination, most of the disordered matrix material in the precursor
has crystallized, and only a small amount of amorphous material remains.
The crystallized material is entirely in a nanocrystalline form, with a mean
grain diameter of 3-4 nm, which is just below the detection limit for XRD.
Analysis of the fringe spacings and interplanar angles from individual grains
suggests that the material is now mainly an intimate mixture of zinc and
copper oxides; the small amount of disordered material corresponds to the
residual occluded carbonate material, as detected by TGA/EGA analysis
of this material. b, BF-STEM of calcined zincian georgeite. c, DF-STEM of
calcined zincian malachite. d, BF-STEM of calcined zincian malachite.
Extended Data Figure 7 | X-ray diffraction analysis of calcined zincian
georgeite and zincian malachite. a, PDF of zincian georgeite, as prepared
and after calcination at 120 °C, 200 °C, 250 °C, 300 °C and 450 °C. There
is little change in the observed PDF up to 250 °C, other than a slight peak
broadening, which can be attributed to a reduction in short-range order.
The dashed line shows the position of the C-O peak, which is retained
until temperatures higher than 300 °C. b, PDF and c, Rietveld fits of
zincian georgeite after calcination at 450 °C. The measured data are shown
as open circles; the fit is a solid black line; the difference is a grey line. Both
techniques determine the product to be a mixture of copper oxide and
zinc oxide (weight ratio of 68/32 by PDF; 67.7(4)/32.3(4) by Rietveld).
d, Ex situ XRD patterns following calcination of zincian georgeite for
2 hours at 250 °C, 300 °C and 450 °C. Open circles, copper oxide; filled
squares, zinc oxide. e, f, In situ XRD analysis of zincian georgeite (e) and
zincian malachite (f) during calcination between 300 °C and 500 °C under
an atmosphere of static air, with XRD scans every 25 °C.
Extended Data Figure 8 | Copper K-edge XAFS analysis of zincian
georgeite and zincian malachite calcined at 300 °C. a, EXAFS (χ) [...]
and georgeite. b, EXAFS (R) comparison of copper oxide (green),
calcined zincian georgeite (black) and zincian georgeite (blue). c-e, Linear
combination fit of copper oxide and zinc oxide with zincian malachite
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cur 8(1)  5.18(2) 0.010 (1) Zn-O 7(2)  3.76(5) 0.004 (5) 
Zn-Zn 5 (3) 4.58 (6) 0.010 (5) 
                           Cu 2p 3/2                  Zn 2p 3/2
Sample                     Position (eV)   Atom%      Position (eV)   Atom%      Cu/(Cu+Zn)
Zn georgeite   Calcined    934.12          2.18       027.82          0.08       0.55
Zn georgeite   Reduced     931.86          14.02      1021.26         28.8       0.33
Zn malachite   Calcined    933.71          13.02      1021.71         16.56      0.48
Zn malachite   Reduced     932.11          17.96      1021.41         37.16      0.33
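
As a quick consistency check on the XPS table, the last column can be recomputed from the two atom% columns. The sketch below is illustrative only; the function name `cu_fraction` is an assumption, and only the reduced-catalyst rows are used because the calcined rows appear to have lost characters in extraction.

```python
def cu_fraction(cu_atom_pct: float, zn_atom_pct: float) -> float:
    """Return the surface Cu/(Cu+Zn) ratio from XPS atomic percentages."""
    return cu_atom_pct / (cu_atom_pct + zn_atom_pct)


# Reduced-catalyst rows as printed in the table: (Cu atom%, Zn atom%).
reduced = {
    "Zn georgeite": (14.02, 28.8),
    "Zn malachite": (17.96, 37.16),
}

for name, (cu, zn) in reduced.items():
    print(f"{name}: Cu/(Cu+Zn) = {cu_fraction(cu, zn):.2f}")
```

Both reduced samples round to the tabulated value of 0.33.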


Extended Data Figure 9 | In situ characterization of final-state, reduced
copper/zinc-oxide catalysts derived from zincian georgeite and zincian
malachite, calcined at 300°C. a, XRD of zincian-georgeite-derived
catalyst after in situ hydrogen reduction (2% H2/N2 at 225°C for 1 hour).
Fine dotted lines indicate zinc-oxide reflections; dashed lines indicate
metallic copper. b, Copper K-edge EXAFS Fourier transform of final
reduced catalysts derived from zincian georgeite and zincian malachite.
c, Copper K-edge EXAFS fit of reduced catalysts. d, Zinc K-edge EXAFS
Fourier transform of final reduced catalysts derived from zincian georgeite
and zincian malachite. e, Zinc K-edge EXAFS fit of reduced catalysts.
f, Electron energy-loss spectra (EELS) of georgeite and malachite,
respectively, during reduction in the ETEM experiment. The EELS data
were acquired in 2 mbar H2 at 225°C, and show the fine structure of the
Cu L2,3 ionization edges, which are characteristic of metallic Cu⁰.
g, EXAFS fitting data for copper and zinc K-edge data. Fitting parameters
for K-edge copper: amplitude-reduction factor (S0²) = 0.91, as deduced
from copper-foil standard; fit range 3<k<11.2, 1<R<5.5 (with k denoting
the fitting window from the χ data, and R, the path length, denoting the
fitting from the Fourier transform of the χ data); number of independent
points = 23; *denotes multiple scattering path. Fitting parameters for
K-edge Zn: S0² = 0.90 as deduced from a ZnO standard; fit range
3.3<k<9.5, 1<R<4.8; number of independent points = 16; *denotes
multiple scattering path. h, XPS analysis of calcined and reduced catalysts
derived from zincian georgeite and zincian malachite.
Extended Data Table 1 | EXAFS fitting parameters for georgeite and malachite

Abs Sc      N           R (Å)          2σ² (Å²)       Ef (eV)    Rfactor
Malachite
Cu-O        4 (fixed)   1.92 (2)       0.007 (2)
Cu-O        2 (fixed)   2.45 (7)       0.02 (1)       (1)        0.014
Cu-C        3 (fixed)   3.01 (4)       0.002 (6)
Cu-Cu       2 (fixed)   3.12 (4)       0.006 (5)
Cu-Cu       3 (fixed)   3.31 (3)       0.007 (3)
Georgeite
Cu-O        4 (fixed)   1.94 (1)       0.004 (1)      2 (1)      0.007
Cu-O        2 (fixed)   2.45 (fixed)   0.03 (fixed)

Fitting parameters for malachite: S0² = 0.72 as deduced from a copper-oxide standard; fit range 3<k<12, 1<R<3.5; number of independent points = 14. Fitting parameters for georgeite: S0² = 0.72 as
deduced from a copper-oxide standard; fit range 3<k<12, 1<R<3.5; number of independent points = 14. Abs Sc, scattering path; N, coordination number (first shell); R, path length; 2σ², Debye-Waller
factor; Ef, change in edge energy; Rfactor, goodness of fit.
Electrostatic catalysis of a Diels-Alder reaction


Albert C. Aragonés, Naomi L. Haworth, Nadim Darwish, Simone Ciampi, Nathaniel J. Bloomfield, Gordon G. Wallace,
Ismael Diez-Perez & Michelle L. Coote


It is often thought that the ability to control reaction rates with an 
applied electrical potential gradient is unique to redox systems. 
However, recent theoretical studies suggest that oriented electric 
fields could affect the outcomes of a range of chemical reactions, 
regardless of whether a redox system is involved. This possibility
arises because many formally covalent species can be stabilized via 
minor charge-separated resonance contributors. When an applied 
electric field is aligned in such a way as to electrostatically stabilize 
one of these minor forms, the degree of resonance increases, 
resulting in the overall stabilization of the molecule or transition 
state. This means that it should be possible to manipulate the 
kinetics and thermodynamics of non-redox processes using an 
external electric field, as long as the orientation of the approaching 
reactants with respect to the field stimulus can be controlled. Here, 
we provide experimental evidence that the formation of carbon- 
carbon bonds is accelerated by an electric field. We have designed 
a surface model system to probe the Diels-Alder reaction, and 
coupled it with a scanning tunnelling microscopy break-junction 
approach. This technique, performed at the single-molecule
level, is perfectly suited to deliver an electric-field stimulus across 
approaching reactants. We find a fivefold increase in the frequency 
of formation of single-molecule junctions, resulting from the 
reaction that occurs when the electric field is present and aligned 
so as to favour electron flow from the dienophile to the diene. Our 
results are qualitatively consistent with those predicted by quantum- 
chemical calculations in a theoretical model of this system, and 
herald a new approach to chemical catalysis. 

All chemical reactions can be viewed as the movement of electrons 
and/or nuclei; as such, one might expect that their kinetics and thermo- 
dynamics could be influenced by external electric fields. Even for 
non-redox reactions, theoretical studies predict that electrostatic effects 
should in principle influence the stabilities of chemical species by
stabilizing or destabilizing charge-separated resonance contributors. For
example, Shaik and co-workers have argued that a covalent bond of the
form X-Y can be thought of as comprising several possible resonance
contributors, [X-Y ↔ X⁺Y⁻ ↔ X⁻Y⁺]. In the absence of an electric
field, the extent to which either of the charge-separated structures will 
contribute to the resonance stabilization of the bond will depend on 
the relative electronegativities of X and Y. Indeed, Shaik and colleagues 
were able to use this concept to explain energy trends in the group 
14 MH3-Cl bonds, among other examples. In this context, the presence
of an appropriately oriented external electric field has the potential
to further stabilize or destabilize these charge-transfer contributors, and 
thereby influence bond energy. Moreover, the participation of minor 
charge-separated resonance structures is not limited to covalent-bond 
energies; it has been invoked to explain trends in kinetics and thermo- 
dynamics for a wide range of chemical reactions, including radical 
addition and transfer. Thus, in principle, the scope of electrostatic
catalysis—manipulating chemical reactions with electric fields—should 
be broad. 
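
The resonance argument above can be illustrated with a toy two-state model: a covalent structure mixed with one charge-separated structure through a coupling term, where an applied field lowers (or raises) the energy of the charge-separated structure. This is a minimal sketch for intuition only, not the quantum-chemical method used in this work; the function names and all numerical values (gap, coupling, field shift, nominally in kJ/mol) are hypothetical.

```python
import math


def ground_state_energy(e_cov: float, e_ion: float, coupling: float) -> float:
    """Lowest eigenvalue of the 2x2 Hamiltonian [[e_cov, coupling], [coupling, e_ion]]."""
    mean = 0.5 * (e_cov + e_ion)
    half_gap = 0.5 * (e_ion - e_cov)
    return mean - math.sqrt(half_gap ** 2 + coupling ** 2)


def resonance_stabilization(gap: float, coupling: float, field_shift: float) -> float:
    """Stabilization of the ground state relative to the pure covalent structure.

    The covalent structure is placed at energy 0 and the charge-separated
    structure at `gap - field_shift`; a positive `field_shift` means the field
    is aligned so as to stabilize the charge-separated structure.
    """
    e_ion = gap - field_shift
    return -ground_state_energy(0.0, e_ion, coupling)


# Hypothetical parameters chosen only to show the trend.
for shift in (-50.0, 0.0, 50.0):
    stab = resonance_stabilization(gap=300.0, coupling=60.0, field_shift=shift)
    print(f"field shift {shift:+.0f}: resonance stabilization {stab:.1f}")
```

In this toy picture, a favourably aligned field increases the mixing and therefore the overall stabilization, while the opposite alignment reduces it.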


Electrostatic catalysis is the least developed form of catalysis in synthetic
chemistry (even though it is widely harnessed by enzymes).
This is because electrostatic effects are strongly directional and are 
effectively quenched in polar media. Enzymes overcome these prob- 
lems by using a low-polarity active site, in which the substrate binds in 
a precisely oriented manner; one or more charged residues within this 
site can then create an oriented local electric field that can catalyse the 
reaction. In synthetic chemistry, one can mimic this process to some 
extent by using charged functional groups on the substrate or catalyst; 
however, balancing the need for low solvent polarity with the limited 
solubility of charged residues in non-polar solvents leads to compromises
that weaken the catalytic effect. For example, aminoxyl radicals
(R1R2NO•) are stabilized via resonance with R1R2N•⁺O⁻. Thus, when a
(remote) negatively charged functional group is placed on the left-hand
side of the N-O• bond, the electrostatic stabilization of this minor
contributor leads to further stabilization of the species. This stabilization,
which has been verified experimentally, promotes dissociation of the
R1R2NO-H and R1R2NO-R bonds by as much as 20 kJ mol⁻¹ in the
gas phase; however, the effect, while still of practical significance as a
‘pH switch’ of radical stability, is effectively halved in energetic terms
in low-polarity solvents such as dichloromethane, and essentially
quenched in polar solvents.

If one could use external electric fields instead of charged chemical 
species as the ‘catalyst’, one could manipulate a much broader range
of reactions, conveniently altering both reactivity and selectivity in a 
tunable manner that can be predicted by theory. However, to probe 
this concept experimentally, one must develop a method of controlling 
the orientation of the external electric field (EEF) with respect to the 
reaction centre. Previously, EEFs have been used to guide the selectivity
of isomerization reactions in which polar intermediates or transition
states are involved; but controlling the orientation of the
EEF as two molecules collide in bimolecular reactions adds another 
dimension to the problem. Here we show that this can be achieved by 
combining surface chemistry procedures with state-of-the-art single- 
molecule techniques that are based on scanning tunnelling 
microscopy (STM). STM-based single-molecule electrical meas- 
urements can reveal information on chemical coupling averaged 
over thousands of collisions. This gives us the ability to control the 
dynamics of the approaching reactants and deliver the field stimulus 
upon collision. Using this approach, we show that a simple, textbook 
bimolecular carbon-carbon bond-forming reaction, the Diels-Alder 
reaction—involving reagents of ostensibly negligible polarity—can be 
accelerated by an oriented EEF. 

Diels-Alder reactions—which involve a conjugated diene and a 
substituted alkene (the ‘dienophile’)—constitute a major family of 
chemical processes that are used in the preparation of fine chemicals.
Our choice of these reactions was inspired by theoretical predictions
by Shaik and colleagues, who suggested that the barrier
heights for certain Diels-Alder reactions can be lowered substan- 
tially when an electric field is oriented appropriately. Here we use a 
Figure 1 | Electrostatic catalysis of a Diels-Alder reaction. a, We studied
the effects of an external electric field on the reaction rate by using single-
molecule STM-BJ conductance measurements, which provide the oriented
electric-field stimulus and also count reaction events. The diene (a furan)
is attached to the STM tip via a thiol group (‘S’); the dienophile
(a norbornylogous bridge) is attached in a known orientation to a flat gold
surface via two thiols. Four structurally distinct products may be formed,
each having two diastereoisomers; the kinetically favoured product is
shown here. ΔV is the voltage difference between the tip and surface
electrodes. b, Possible resonance structures of the transition state. When
an electric field is present, minor contributors I or III may be stabilized
enough to undergo resonance with II, lowering the reaction barrier. The
vertical arrows show the field direction most likely to stabilize I or III, with
I expected to experience greater stabilization at a given field magnitude.


surface-tethered furan derivative as the diene, and a norbornylogous 
bridge with a terminal double bond as a non-polar dienophile ((+)-NB,
tricyclo[4.2.1.0²,⁵]non-7-ene-3,4-dimethanethiol; only the
(1(R),2(R),3(R),4(R),5(S),6(S))-enantiomeric form is shown in Fig. 1a).
Norbornylogous bridges are conformationally rigid molecules, and 
have been extensively used as electrical conduits for probing how 
geometrical and structural factors influence chemical and electrochemical
phenomena. The NB in Fig. 1a is a short, rigid, non-polar
dienophile with two CH2SH groups (feet) in trans-stereochemistry;
the presence of these feet allows unambiguous orientation of the distal
double bond (dienophile) when assembled on flat gold surfaces.
The rigidity of NB helps us to position and align the dienophile with 
respect to the EEF when the diene part of the system is brought nearby. 
The molecular length of the dienophile is kept to a minimum (five 
sigma bonds) in order to maintain the Diels-Alder product within 
the conductance limit of our system. 

To ascertain whether this Diels-Alder reaction is sensitive to the 
presence of an oriented EEF, we used quantum chemistry to study the
field effect on the reaction barrier. This reaction has four structurally 
distinct Diels-Alder products; results for the reaction with the lowest 
barrier (exo-syn) are shown in Fig. 2 (see Supplementary Information 
for full results). There are two diastereoisomers for each of the four 
products, with the substituents of the furan being located either on the 
Figure 2 | Computational modelling of the Diels-Alder reaction.
a, The two diastereoisomers of the exo-syn product of this reaction.
These were the kinetically favoured products; the six other possible
products had much higher reaction barriers over the experimental
range of field strengths. In the blue diastereomer, the substituents of the
furan are located on the left of the molecule; in the red diastereomer,
these substituents are located on the right. b, The coordinate axes used
to orient the field with respect to the molecule. The z axis lies along the
double bond of the dienophile, while the x axis is directed along the NB
backbone. c, The scenario being modelled, showing the NB bridge sitting
in the experimentally determined orientation with respect to the surface
of the STM plate and to the electric field lines, which are passing through
the reaction centre at an oblique angle to the NB double bond. d, The
predicted effects of the strength and direction of the external electric
field (EEF) on the reaction-barrier height (ΔE‡) for formation of the
two exo-syn diastereoisomers in a (see Supplementary Information).
The formation of the blue structure is quite insensitive to the EEF over
the experimental range of field strengths, while the formation of the red
structure shows strong field sensitivity. The diagrams in the bottom right
corner show the possible directions of the electric field (where Esurface
denotes the voltage applied between the surface and the tip).


left of the molecule (see the blue diastereomer in Fig. 2a) or on the right 
(the red diastereomer). 

The positioning of these substituents leads to different interactions 
with the CH2SH groups at the opposite end of the NB, resulting in 
slightly different energies when no EEF is present and very different 
responses to the applied field. Experimentally, the NB is known to sit at 
an angle to the gold surface, tilted by 30° along the y axis and 25° along
the z axis (Fig. 2c). For the exo-syn product, the 30° y-axis tilt means 
that the field lines are oriented roughly along the average vector of the 
forming bonds. However, the z-axis tilt means that the furan SH group 
sits above the forming bonds for the blue diastereoisomer (Fig. 2c), 
but to the side of the molecule for the red diastereoisomer. As a result, 
the blue structure is predicted to be quite insensitive to the EEF over 
the experimental range of field strengths, while the red structure should 
show strong field sensitivity (see Supplementary Information, section 3). 
Specifically, it is predicted that, for negative bias, the barrier to for- 
mation of the red isomer will decrease with increasing field strength; 
for positive bias, it is predicted to increase (Fig. 2d). This trend occurs 
because the negatively biased EEF can stabilize resonance contributor I 
(Fig. 1b); a positively biased field will destabilize it. In principle, a pos- 
itively biased field should also lower the barrier height by stabilizing 
resonance contributor III. However, configuration III has much less 
inherent stability than contributor I, as the electronegative oxygen 
prefers to bear a negative rather than a positive charge. As a result, this
configuration contributes only at strong positive fields, outside of the 
experimental range (see Supplementary Figs 3-8). Within the experi- 
mental range, our calculations predict that the frequency of adduct for- 
mation should increase systematically as the strength of the negatively 
biased field increases, up to a factor of 1.5 at −0.75 V, while remaining
relatively independent of EEF strength for positively biased fields. 

To test these predictions experimentally, we attached the NB to 
the surface of a flat gold electrode, and the furan to the STM gold tip. 
We then undertook a series of STM break-junction (STM-BJ)
experiments known as ‘blinking’. The blinking technique detects
the formation of molecular bridges between an STM tip and a substrate
electrode, while they are held at a fixed electrode-electrode
distance, by imposing an initial set-point tunnelling current (Fig. 3
and Supplementary Information, section 2). After the set-point current 
is reached, the feedback loop is turned off and the current is monitored. 
Current jumps (blinks) appear when a molecular bridge spans the gap 
between the electrodes (Fig. 3c). We detected blinks in conductance of 
magnitude (1.5-2) x 10~°Gp when the tip and substrate were separated 
by a distance that allows the Diels-Alder reaction to occur (about 1 nm). 
These junctions formed only when both reactants were present. When 
either reactant was removed from its respective electrode, or when their 
saturated analogues were used (2-methyl-3-tetrahydrofuranthiol on 
the tip, or a hydrogenated version of NB on the surface—a system 
that is structurally identical but which lacks the diene-dienophile 
character), there was no evidence of molecular-bridge formation 
(Supplementary Information, section 2). Hence, the junctions are 
formed through the Diels-Alder reaction. At positive voltage biases (sur- 
face positive), the frequency of molecular-bridge formation is constant 
Figure 3 | Blinking experiments. a, Diagram
of the STM tip and surface during a blinking
experiment. The STM tip was modified by
furan molecules and the surface with NB
molecules, using self-assembled-monolayer
procedures (see Supplementary Information).
b, The stages encountered during a blinking
event. c, A real-time data capture of blinking
events. The time breaks in the x axis are about
2 min. The inset shows the STM current
response before (1), during (2) and after (3)
the formation of a single blink (junction).
d, Two-dimensional maps overlaying hundreds
of blinks. Counts have been normalized to
a colour scale, with 100 counts representing
the maximum, and 0 representing the
minimum. The surface bias was −0.5 V in c, d.
G0 = 2e²/h = 77.5 µS, with h the Planck constant
and e the elementary charge.
(five blinks per hour) over a wide range of applied biases. In contrast,
at negative biases the frequency is clearly affected by the strength of the 
field, and increases from five blinks per hour at a bias of −0.05 V, to 25
blinks per hour at a bias of −0.75 V (Fig. 4). These trends are in complete
qualitative agreement with the theoretical predictions in Fig. 2d.
Quantitatively, there are differences, which may relate to the differ- 
ence in the realms being studied experimentally and computationally 
(single-molecule reaction rates versus bulk reaction rates), and/or the 
use of pre-complexes in calculating field effects on barrier heights 
(see Supplementary Information). 
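
To put the observed and predicted rate enhancements on an energy scale, one can convert a rate ratio into an apparent change in barrier height. The sketch below is a rough illustration under explicit assumptions that are not stated in the text: the blink frequency is taken as proportional to the rate constant, the rate follows an Eyring/Arrhenius-type law with an unchanged prefactor, and the temperature is 298.15 K. The function name `barrier_change_kj` is hypothetical.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J mol^-1 K^-1


def barrier_change_kj(rate_ratio: float, temperature_k: float = 298.15) -> float:
    """Apparent change in barrier height (kJ/mol) implied by k_field / k_ref.

    Assumes k_field / k_ref = exp(-ddG / (R T)) with an unchanged prefactor;
    a negative result means the barrier is lowered.
    """
    return -R_GAS * temperature_k * math.log(rate_ratio) / 1000.0


observed = barrier_change_kj(25 / 5)  # blink frequency rises from 5 to 25 per hour
predicted = barrier_change_kj(1.5)    # factor of 1.5 predicted by the calculations
print(f"observed: {observed:.2f} kJ/mol, predicted: {predicted:.2f} kJ/mol")
```

Under these assumptions the fivefold experimental increase corresponds to a barrier lowering of roughly 4 kJ/mol, against roughly 1 kJ/mol for the predicted factor of 1.5, which is one way to express the quantitative gap noted above.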

As further validation that the experiment is detecting the forma- 
tion of carbon-carbon bonds, we note that the average lifetime of 
the blinks was 0.4 s (Fig. 3d), with little dependence on the electric-field
magnitude for positive biases up to 0.75 V (Supplementary
Information, section 2). This lifetime is around the same as that
observed for standard single-molecule wires that are thiolated at both
ends. A bias value higher than +0.75 V or lower than −0.75 V led
to a drop in the lifetime of the junctions, owing to the instability of the
gold-sulfur contacts. Hence, we kept the bias within
the range +0.75 V to −0.75 V, to allow us to compare different biases
while maintaining similar junction stabilities. We also confirmed the
formation of mechanically stable carbon-carbon bonds by collecting 
pulling curves during the blinking events (Supplementary Figs 2-6), 
or by performing pushing/pulling cycles (Supplementary Figs 2-7). 
Pulling curves collected over the blinks showed a plateau with an 
average pulling length of 0.2 nm, which corresponds to the stretching 
of the single-molecule bridge. When the pulling was exerted over the 
tunnelling background or on random noise (Supplementary Figs 2-6), 
a clean exponential decay was observed, testifying that the above blinks 
are attributable to a stable molecular junction, rather than resulting 
from the migration of gold atoms or from fluctuations in molecular 
conformations.

We further confirmed that robust molecular junctions were formed 
using an STM-BJ approach referred to as ‘tapping’. Here, the
furan-modified tip was repeatedly driven into and out of contact with 
the NB-modified substrate (Supplementary Figs 2-7): when the reac- 
tants were mechanically brought together, molecular junctions formed; 
when the tip was pulled away, the junctions broke. During the pull- 
ing portion of the cycles, we detected plateaus in the current-versus- 
distance curves of the same conductance magnitude as that observed 
in the blinking experiments ((1.5-2) x 10~°Gpo; Supplementary 
Information, section 2 and Supplementary Figs 2-7), further support- 
ing the idea that stable molecular wires form when the two reactants are 
brought together under an electric field. Moreover, changes to the mag- 
nitude and direction of the field applied across the reactants affected the 
rate of the Diels-Alder reaction in a similar manner to that found via 
Figure 4 | Frequency of blinks (junctions) as a function of the applied
bias. Positive and negative biases are plotted in red and blue, respectively. 
To keep the distance between the surface and the tip constant across the 
bias range, we used the same set-point current and changed the bias while 
the STM feedback was turned off. We performed blinking experiments over 
periods of one hour. At the end of each one-hour period, we changed the 
furan-modified STM tip and its lateral position with respect to the surface to 
compensate for the loss of reactants. We repeated this procedure eight times, 
giving each bias point (magnitude and direction) a total experimental time 
of eight hours. We changed the chronology of the selected biases randomly 
for each repeat. Error bars represent the standard deviation from the eight 
(one-hour) intervals. The dashed lines are included for visual guidance. 


blinking. Tapping data show that product formation increases
4.4-fold, from 4.2% (252 product molecules out of 6,000 attempts) to 18.6%
(1,116 product molecules out of 6,000 attempts) when the surface bias
is changed from −0.05 V to −0.75 V (Supplementary Figs 2-7).
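
The tapping statistics quoted above can be reproduced directly from the raw counts. This is a small arithmetic sketch; the helper name `yield_percent` is an assumption made for illustration.

```python
def yield_percent(products: int, attempts: int) -> float:
    """Percentage of tapping attempts that produced a junction."""
    return 100 * products / attempts


low_bias = yield_percent(252, 6000)    # surface bias of -0.05 V
high_bias = yield_percent(1116, 6000)  # surface bias of -0.75 V
fold = high_bias / low_bias

print(f"{low_bias:.1f}% -> {high_bias:.1f}% ({fold:.1f}-fold)")
```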

We have presented the first (to our knowledge) experimental 
evidence of a non-redox, bond-forming process being acceler- 
ated by an oriented EEF. Our experimental results are qualitatively 
consistent with theoretical calculations, and result from the abil- 
ity of the electric field to electrostatically stabilize a minor charge- 
separated resonance contributor of the transition state. This ability to 
manipulate chemical reactions with electric fields offers proof-of-principle 
for a change in our approach to heterogeneous catalysis. 
Upper-plate controls on co-seismic slip in the
2011 magnitude 9.0 Tohoku-oki earthquake


Dan Bassett, David T. Sandwell, Yuri Fialko & Anthony B. Watts


The March 2011 Tohoku-oki earthquake was only the second 
giant (moment magnitude Mw > 9.0) earthquake to occur in the
last 50 years and is the most recent to be recorded using modern
geophysical techniques. Available data place high-resolution
constraints on the kinematics of earthquake rupture, which have
challenged prior knowledge about how much a fault can slip in a
single earthquake and the seismic potential of a partially coupled
megathrust interface. But it is not clear what physical or structural
characteristics controlled either the rupture extent or the amplitude 
of slip in this earthquake. Here we use residual topography and 
gravity anomalies to constrain the geological structure of the 
overthrusting (upper) plate offshore northeast Japan. These data 
reveal an abrupt southwest-northeast-striking boundary in upper- 
plate structure, across which gravity modelling indicates a south- 
to-north increase in the density of rocks overlying the megathrust of 
150-200 kilograms per cubic metre. We suggest that this boundary 
represents the offshore continuation of the Median Tectonic 
Line, which onshore juxtaposes geological terranes composed 
of granite batholiths (in the north) and accretionary complexes 
(in the south). The megathrust north of the Median Tectonic
Line is interseismically locked, has a history of large earthquakes
(18 with Mw > 7 since 1896) and produced peak slip exceeding
40 metres in the Tohoku-oki earthquake. In contrast, the
megathrust south of this boundary has higher rates of interseismic
creep, has not generated an earthquake with MJ > 7 (local
magnitude estimated by the Japan Meteorological Agency) since
1923, and experienced relatively minor (if any) co-seismic slip in
2011. We propose that the structure and frictional properties of the
overthrusting plate control megathrust coupling and seismogenic 
behaviour in northeast Japan. 

The seismic moment of megathrust earthquakes is proportional to 
the product of rupture area, the average amount of slip, and the effective 
shear modulus. Most variability is usually attributed to rupture area,
and so research has focused on the structural and geometrical barriers
that segment plate boundaries and limit the dimensions, and thus
magnitude, of earthquakes. Previous studies have shown the roughness
of subducting plates to be one of the first-order controls on rupture
dimensions; but the amplitude of slip is proportional to the rupture
area and the stress drop, and may be harder to predict. In particular,
of the five earthquakes since 1900 that had Mw > 9, the March 2011
Tohoku-oki earthquake had the smallest rupture area by 50% (that is, its
rupture area was 0.5 times that of the next-smallest earthquake in Kamchatka
in 1952), but had the largest maximum displacements by 75% (that is,
its maximum slip was 1.75 times that of the next-largest earthquake in 
Chile in 1960) (ref. 1 and Extended Data Fig. 1). 
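
The trade-off between rupture area and slip described above can be made concrete with the standard definitions of seismic moment (shear modulus × area × average slip) and moment magnitude, Mw = (2/3)(log10 M0 − 9.1) with M0 in N m. The sketch below is illustrative only: the shear modulus, rupture areas and slip values are hypothetical round numbers, not estimates from this study.

```python
import math


def seismic_moment(shear_modulus_pa: float, area_m2: float, slip_m: float) -> float:
    """Seismic moment M0 in N m: shear modulus x rupture area x average slip."""
    return shear_modulus_pa * area_m2 * slip_m


def moment_magnitude(m0_newton_metres: float) -> float:
    """Moment magnitude Mw from M0 in N m."""
    return (2.0 / 3.0) * (math.log10(m0_newton_metres) - 9.1)


MU = 3.0e10  # hypothetical effective shear modulus, Pa

# Two hypothetical ruptures: a large area with modest slip, and half the
# area with twice the slip. Both have the same moment and hence the same Mw.
broad = seismic_moment(MU, 400e3 * 200e3, 20.0)
compact = seismic_moment(MU, 200e3 * 200e3, 40.0)

print(f"broad rupture:   Mw = {moment_magnitude(broad):.2f}")
print(f"compact rupture: Mw = {moment_magnitude(compact):.2f}")
```

This shows why magnitude alone does not constrain slip amplitude: a compact rupture with large slip and a broad rupture with modest slip can share the same Mw.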

Geodetic observations show that most of the elastic strain energy 
released in megathrust earthquakes is accumulated in the overthrust- 
ing plate so mapping along-strike variations in upper-plate structure 
is critical to understanding fault loading. But identifying upper-plate 
structural variations from the topography and gravity fields is difficult 
because the relatively small-amplitude, short-wavelength structure is 


masked by the large-amplitude, trench-normal topography and gravity 
gradients associated with subduction zones. Here the application of 
spectral averaging routines designed to isolate and remove these gradients
provides observations that reveal how the crustal structure
of the northeast Japan forearc influenced the rupture pattern of the 
Tohoku-oki earthquake. 

Residual topography and gravity anomalies were calculated by 
regionally subtracting a spectral average of the trench-normal topo- 
graphy and gravity anomalies, and are shown in Fig. 1 (see Methods 
and Extended Data Fig. 2 for grid processing). The Pacific oceanic 
plate is subducting beneath Honshu at 80-85 mm yr⁻¹ and the submarine
volcanoes of Erimo and the Joban seamounts are observed
as positive residual topographic anomalies with amplitude >2 km
both seaward and landward of the trench axis. The absence of similar 
anomalies along-strike suggests that the northeast Japan megathrust 
is relatively smooth. Across the forearc, our analysis reveals an abrupt 
and approximately linear transition in structure striking southwest- 
northeast (red dashed line in Fig. 1). Residual topography and gravity 
anomalies regionally increase from south to north across this boundary 
by ~0.8 km and ~60 mGal respectively (Fig. 1d). A second across-forearc
transition of similar bathymetric character is located southeast
of Hokkaido. The smooth and homogeneous nature of the subducting 
Pacific plate seaward of the trench axis strongly suggests that both 
transitions are related to the structure of the overthrusting plate. 
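
The idea of a residual field can be sketched in a few lines: remove the average trench-normal profile from every profile, so that only along-strike departures remain. The example below uses a plain arithmetic mean as a stand-in for the spectral averaging described in the text, so it is a simplification rather than the authors' procedure; the function name `residual_grid` and the synthetic numbers are hypothetical.

```python
def residual_grid(grid):
    """Subtract the along-strike mean profile from every trench-normal profile.

    grid[i][j] is the value on profile i (ordered along strike) at
    trench-normal position j.
    """
    n_profiles = len(grid)
    n_points = len(grid[0])
    mean_profile = [sum(row[j] for row in grid) / n_profiles for j in range(n_points)]
    return [[row[j] - mean_profile[j] for j in range(n_points)] for row in grid]


# Synthetic example: a shared trench-normal slope (km), plus a 0.8 km step
# between the two southern and the two northern profiles.
slope = [0.0, -1.0, -2.0, -3.0]
south = list(slope)
north = [z + 0.8 for z in slope]
grid = [south, south, north, north]

residual = residual_grid(grid)
step = residual[2][0] - residual[1][0]
print(f"south-to-north step in the residual: {step:.1f} km")
```

The shared slope cancels in the residual, leaving only the along-strike step, which is the kind of signal the residual maps are designed to expose.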

Extension of the southern Japan forearc boundary to the trench axis 
makes it unlikely that variations in forearc Moho depth contribute to 
the along-strike contrast in gravity anomalies. We thus attribute resid- 
ual gravity anomalies to trench-slope topography and lateral variations 
in the density of rocks comprising the overthrusting plate. The density 
contrasts associated with these variations are estimated by discretizing
the forearc into 10 km × 10 km vertical prisms extending between
the sea floor and the seismically constrained base of the forearc crust 
(Extended Data Figs 3-7), with a single density anomaly calculated for 
each prism (see Methods). The contrast in residual gravity anomalies 
can be explained by a south-to-north increase in the mean density of 
forearc rocks by ~150-200 kg m⁻³ (Fig. 1c, d). The inverse proportionality
between density contrasts and crustal thickness makes this
a minimum estimate and the reduction in crustal thickness near the 
trench axis increases the magnitude of density contrasts (Fig. 1c). This 
analysis shows that the sharp change in trench-slope morphology 
is coincident with a transition in the density of the materials that 
comprise the southern Japan forearc crust. 
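
As an order-of-magnitude check on the link between the gravity step and the density contrast, one can use the infinite-slab (Bouguer) approximation, Δg = 2πGΔρh. This is a deliberately crude stand-in for the prism-based calculation described above, and the layer thicknesses tried below are hypothetical values chosen only to show the inverse dependence on thickness; the function names are assumptions.

```python
import math

G = 6.674e-11  # gravitational constant, m^3 kg^-1 s^-2
MGAL = 1e-5    # 1 mGal expressed in m s^-2


def slab_gravity_mgal(density_contrast: float, thickness_m: float) -> float:
    """Gravity anomaly (mGal) of an infinite slab: 2 * pi * G * drho * h."""
    return 2 * math.pi * G * density_contrast * thickness_m / MGAL


def slab_density_contrast(gravity_mgal: float, thickness_m: float) -> float:
    """Density contrast (kg/m^3) needed for a slab to produce a given anomaly."""
    return gravity_mgal * MGAL / (2 * math.pi * G * thickness_m)


# Density contrast required to explain a 60 mGal step for several
# hypothetical layer thicknesses.
for thickness_km in (7, 10, 15):
    drho = slab_density_contrast(60.0, thickness_km * 1e3)
    print(f"h = {thickness_km:2d} km -> density contrast ~ {drho:.0f} kg/m^3")
```

A thinner layer needs a larger density contrast to produce the same anomaly, which is the inverse proportionality noted in the text; in this approximation, thicknesses of several kilometres give contrasts of the same order as the ~150-200 kg m⁻³ reported.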

The onshore geology’ of Japan provides a physical interpretation of 
this transition (Fig. 2). The Median Tectonic Line (MTL) is the most 
prominent structural boundary and clearly separates two contrasting 
geological terrains. On the continent (north) side, a ~20-km-thick 
granitic upper-crust reflects intrusion of arc melt since the Cambrian 
Period*. In contrast, on the ocean (south) side, the forearc crust is 
entirely composed of variably metamorphosed Late Mesozoic to 
Cenozoic accretionary complexes’. A variety of tectonic models have 
been proposed to explain the unnatural juxtaposition of these terrains’, 
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Figure 1 | Forearc anomalies in the northeast Japan subduction 

zone. a, Residual topography. Dashed red and grey lines mark the forearc 
segment boundary and the trench-axis respectively. Triangles show arc 
volcanoes. Hk, Hokkaido; E, Erimo seamount; T, Tokyo; K, Kashima. 

b, Residual gravity anomalies. The black dashed line marks the intersection 
of the subducting slab with the forearc Moho. c, Mean forearc density 


but magnetotelluric!” and active source seismic!“ data show a shallow 
(35°-45°) north-dipping fault geometry that is most consistent 
with a thrust faulting origin either along a low-angle mid-crustal 
detachment? or the Cretaceous subduction megathrust'*. These data 
further demonstrate that the fundamental contrast in upper-plate struc- 
ture expressed in surface geology across the MTL persists to depths of 
at least 20 km (ref. 13). 

We propose that the MTL extends offshore Kashima, connecting to 
the step in residual gravity and bathymetry. The implication is that the 
abrupt change in forearc structure represents the lithological juxtapo- 
sition of granitic batholiths to the north and accretionary complexes to 
the south. This suggestion is consistent with the offshore extent of pos- 
itive aeromagnetic anomalies, which have been modelled as batholiths 
10–15 km thick¹⁵. The trend of aeromagnetic anomalies (red dashed
line in Fig. 2) correlates the largest-amplitude residual gravity anomaly 



anomalies. Contours (5-km increment) show forearc crustal thickness. 

d, Profiles (grey lines) perpendicularly traversing the forearc segment 

boundary showing the south-to-north increases in topography (~0.8 km), 

residual gravity anomaly (~60 mGal) and mean forearc density anomalies 

(~150–200 kg m⁻³). The mean for each ensemble of profiles is plotted

in black. 


(labelled KB) with the Early Cretaceous Kitakami batholith, which is 
exposed onshore in the Kitakami mountains (KM) (Fig. 2). The pres- 
ence of low-to-mid-pressure metamorphic rocks within the southern 
region of the Abukuma highland (AH) is consistent with the MTL 
geometry inferred offshore. 

The most interesting and important aspect of this forearc transition 
is that it is highly correlated with the seismogenic behaviour of the 
megathrust, as shown in Fig. 3. First, the forearc segment boundary 
is associated with a sharp north-to-south reduction in the number of 
intermediate-magnitude earthquakes (Fig. 3a). Second, historical rupture
areas of large (Mw > 7.0) megathrust earthquakes between 1896
and March 2011 (Supplementary Table 2) are similarly focused north 
of the proposed extension of the MTL (Fig. 3b); and the megathrust 
to the south, where the forearc is characterized by negative bathyme- 
tric and gravimetric anomalies, has not generated an earthquake with 
Figure 2 | Simplified geology¹¹ and major tectonic boundaries’ of
Japan. Volcanic and plutonic rocks (Cambrian to middle Miocene) are
shown in black. Accretionary complexes (Jurassic to middle Miocene)
are shown in grey. Metamorphic rocks (Cretaceous to middle Miocene) are
shown in red and green. Main faults are shown in brown. MTL geometry 
is constrained by the juxtaposition of high-pressure footwall rocks (red 
shading indicates the Sanbagawa and Shimanto accretionary complexes) 
with coeval low-pressure granitic hanging-wall rocks (green shading 
indicates the Ryoke-Sanyo belt). Note the similar transition in peak 
metamorphic pressure observed across the Hidaka main thrust (HMT) 


Mw > 7.0 in the >90-year duration of the Japan Meteorological Agency
(http://www.jma.go.jp/en/quake/) earthquake catalogue. Third, there
is a strong correlation between forearc structure and the distribution
of co-seismic slip in the 2011 Mw 9.0 Tohoku-oki earthquake¹ (Fig. 3c).
The Tohoku earthquake filled a seismic gap as defined by the rup- 
ture areas of earlier large thrust earthquakes (Fig. 3b) and most of 
the moment release occurred north of the forearc segment boundary 
in regions characterized by positive residual gravity anomalies. This 
correlation appears to be robust, as shown by a comparison with dif- 
ferent published co-seismic slip models of the Tohoku earthquake that 
employed different data types and inversion algorithms (Extended 
Data Fig. 8). Finally, geodetically constrained interseismic deforma- 
tion models all show a high degree of fault locking within the region of 
positive residual gravity’, but creep to the south of the forearc segment 
boundary (Fig. 3d, Supplementary Fig. 2). 

The simplest interpretation of the relationships described above is 
that upper-plate lithology modulates the frictional behaviour of the 
megathrust. Frictional properties of typical crystalline rocks comprising
the volcanic arc are characterized by a relatively high static
coefficient of friction (μ₀ = 0.6–0.8) and strong velocity weakening¹⁶,¹⁷.
Increasing clay content decreases the coefficient of friction
(μ₀ = 0.2–0.4 in gouge with >50 wt% clay)¹⁶,¹⁸ and promotes a more
stable response to perturbations in fault slip rate¹⁸. Materials south of
the forearc segment boundary are thus expected to be both weaker and 
less velocity weakening. 

Numerical models of earthquake cycles incorporating rate- and 
state-dependent friction show that in the interseismic period, asper- 
ities are loaded at a rate modulated by their strength and location 
in Hokkaido. KM, Kitakami mountains; KB, Kitakami batholith; AH,
Abukuma highland; I-STL, Itoigawa-Shizuoka tectonic line. The strike
of the Kitakami aeromagnetic anomalies is shown as a dashed red line¹⁵.
The inset compares the density distribution of granitic rocks east of the
volcanic arc in northeastern Honshu³⁰ and the mean dry densities for
Tertiary and Mesozoic sedimentary rocks (error bars show one standard
deviation)³¹. The modelled density contrast across the forearc segment
boundary (~150–200 kg m⁻³) is within the range expected across
the MTL.


relative to other asperities, and the flexural rigidity of the upper
plate, which controls the extent of stress shadowing¹⁹,²⁰. Stress
increases are maximum on asperities with the highest μ₀ and the
most negative values of the rate-dependent friction parameter
a − b, relative to adjacent fault regions (a and b represent the
magnitudes of the direct and evolution effects in friction, respectively)²¹.
Correspondingly, the stress drop and amplitude of co-seismic slip are
amplified by sharper contrasts in frictional properties at the asperity
boundaries²². Dynamic rupture simulations also show that
rupture fronts decelerate as they penetrate into unloaded, velocity-
strengthening, or compliant (less rigid) regions, which may ultimately
arrest co-seismic ruptures²³.
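
The role of a − b described above can be illustrated with the standard steady-state form of rate-and-state friction, μ_ss(V) = μ₀ + (a − b) ln(V/V₀). The sketch below is not the earthquake-cycle model used in this study. The reference slip rate `v0`, the two slip rates and the a − b values are illustrative assumptions; only the μ₀ values are taken from the ranges quoted in the text.

```python
import math


def steady_state_friction(mu0, a_minus_b, v, v0=1e-6):
    """Steady-state rate-and-state friction at slip rate v (m/s).

    mu0        reference friction coefficient at slip rate v0
    a_minus_b  rate-dependence parameter (negative: velocity weakening)
    v0         reference slip rate; 1e-6 m/s is an illustrative choice
    """
    return mu0 + a_minus_b * math.log(v / v0)


# Illustrative parameter sets. The mu0 values sit inside the ranges quoted in
# the text; the a - b values are assumed for demonstration only.
crystalline = {"mu0": 0.7, "a_minus_b": -0.004}  # strong, velocity weakening
clay_rich = {"mu0": 0.3, "a_minus_b": 0.004}     # weak, velocity strengthening

slow, fast = 1e-9, 1e-3  # roughly plate-rate creep versus accelerating slip, m/s

for name, p in (("crystalline", crystalline), ("clay-rich", clay_rich)):
    mu_slow = steady_state_friction(p["mu0"], p["a_minus_b"], slow)
    mu_fast = steady_state_friction(p["mu0"], p["a_minus_b"], fast)
    trend = "weakens" if mu_fast < mu_slow else "strengthens"
    print(f"{name}: mu_slow = {mu_slow:.3f}, mu_fast = {mu_fast:.3f} -> {trend}")
```

With a negative a − b the fault loses strength as slip accelerates, which favours unstable (seismic) slip; with a positive a − b it gains strength, which favours stable creep.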

Under this framework, the anomalous nature of the Tohoku asperity 
may be attributed to the sharp transition in μ₀ and/or a − b across the
MTL. The abrupt nature of this transition results in the local develop- 
ment of a large stress concentration in the interseismic period due to 
creep in (presumably) velocity-strengthening regions down-dip and 
south of the asperity. Low seismicity north and south of the asperity 
(Fig. 3a, b) may reflect reduced interseismic stressing rates within the 
Tohoku ‘stress shadow’. Simulations of earthquake cycles on a fault with
heterogeneous distributions of μ₀ and a − b further reveal that large
events can occur within the area that occasionally ruptures in great 
earthquakes (Extended Data Fig. 10). We propose that stress hetero- 
geneities either side of the MTL may explain both the small slip area 
and the large slip amplitude of the Tohoku-oki earthquake rupture. 
(1) The extent of co-seismic slip was limited by the inability of the 
rupture front to penetrate into the low-stress or velocity-strengthening 
fault segment south of the MTL. (2) The highly stressed or strongly 
Figure 3 | Slip behaviour of the northeast Japan megathrust.
a, Instrumental earthquake record. Grey plus symbols show the epicentres
of earthquakes in the JMA catalogue (1923–2015) with Mw > 6.5.
Dashed ellipses show the aftershock area of thrust earthquakes in the
Global Centroid Moment Tensor (http://www.globalcmt.org) catalogue
(period 1976–2014) with 6.5 < Mw < 8. No aftershock areas cross
the forearc segment boundary (dashed red line). b, Rupture areas for
large (Mw > 7.0) megathrust earthquakes between 1896 and the 2011
Tohoku-oki earthquake (Supplementary Table 3). c, Co-seismic slip
contours (10-m increment) for the 2011 Mw 9 Tohoku-oki earthquake.
The 20-m slip contour (thicker contour line) defines the Tohoku
asperity. d, Inter/postseismic deformation. Contours show interseismic
back-slip rate (increment 2 cm yr⁻¹)². Arrows show 1-year postseismic
displacements of seafloor GPS sites²⁴. The fast seaward motion of site
FUKU is associated with shallow afterslip²⁴.


velocity-weakening area with high rates of tectonic loading north of the
MTL resulted in the large slip amplitude’. This interpretation implies
accelerated postseismic afterslip within the ‘stress shadow’ areas of the
Tohoku asperity. Robust afterslip has indeed been inferred in that area
from the fast seaward motion of the seafloor Global Positioning System
(GPS) site FUKU²⁴ (Fig. 3d and Extended Data Fig. 9). The seaward
motion of this site is contrary to the postseismic landward motions
observed at seafloor GPS sites within the main rupture area, and is
probably caused by substantial afterslip south of the forearc segment
boundary²⁴. Regional seafloor geodetic measurements from a cabled
seismological and geodetic observatory²⁵ will place important constraints
on the extent of post-seismic creep and moment accumulation
rates, and provide further insights into tectonic loading of the Tohoku
asperity, fault-slip behaviour south of the MTL and, ultimately, seismic
hazard in Kashima and eastern Tokyo.

The correlation between the forearc segment boundary and megathrust
slip behaviour appears to persist west of the intersection
of the slab with the forearc Moho. This may reflect subduction
of eroded upper-plate materials beyond the forearc Moho, thereby
extending the influence of upper-plate lithology on the frictional
properties of the megathrust to greater depth. The obliquity of the MTL
may cause subduction of eroded materials to reduce the along-strike
gradient in frictional properties, but both the location and magnitude
of this effect will depend on the depth range of tectonic erosion.
Finally, the contrast in mean forearc density and trench-slope
topography across the MTL will result in an along-strike variation in
lithostatic pressure of 40–60 MPa. Although considerably lower than
dip-parallel variations in lithostatic stress associated with slab-dip and
trench-slope topography, dip-parallel lithostatic gradients occur both
north and south of the MTL and the along-strike contrast in lithostatic
stress may still be important in influencing the lateral distribution of
plate locking at a given depth.

Two additional insights from this study may be important for
understanding seismic hazard in other subduction zones. The first
comes from the observation that the accreted terranes that appear
to be creeping south of the forearc segment boundary’ also overlie
the Nankai megathrust in southwest Japan, which is characterized
by interseismic locking²⁶ and produced large earthquakes in 1944
(Mw 8.1), 1946 (Mw 8.3) and 1968 (Mw 7.5)²⁷. On one hand, this may
reflect differences in the relative proportion of subducting and
overthrusting plate materials within the megathrust shear zone, and the
strong dependence of the a − b value of clay-rich sediments on
lithologic heterogeneity²⁸. On the other hand, the amplitude of stress
heterogeneities is dependent on spatial gradients in μ₀ and a − b.
The apparent low seismicity north and south of the Tohoku asperity
might be due to ‘stress shadowing’, resulting in interseismic stressing
rates on the megathrust even lower than those implied by models of
interseismic coupling (Fig. 3d). In Nankai, the composition of the
overthrusting plate is more homogeneous along-strike, which may
result in smaller variations in fault frictional properties, and more
uniform interseismic loading rates. These factors are likely to encourage
earthquake ruptures in Nankai to extend over larger areas. The
second insight is that low residual gravity anomalies are not a good
indicator of seismogenic behaviour, as proposed in previous
studies²⁷,²⁹. The residual topography and gravity fields are now available
for all subduction zones on Earth", but their real utility comes from
enabling existing and future seismological and geodetic observations
to be considered in the context of subducting and overthrusting plate
structure. This study shows that these considerations are an essential
component of hazard assessment.
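
As a rough check on the 40–60 MPa along-strike contrast in lithostatic pressure quoted in the main text, the contributions from density and from topography can be added by hand. This is a back-of-the-envelope sketch, not the authors' calculation. The 20 km crustal thickness, the 2,800 kg m⁻³ rock density and the 1,030 kg m⁻³ sea-water density are assumptions chosen to be broadly consistent with values given elsewhere in the paper; the function name is hypothetical.

```python
G_ACC = 9.81  # gravitational acceleration, m s^-2


def lithostatic_contrast_mpa(delta_rho, thickness_m, delta_topo_m,
                             rock_rho=2800.0, water_rho=1030.0):
    """Along-strike difference in lithostatic pressure at the base of the forearc.

    delta_rho     contrast in mean forearc density, kg m^-3
    thickness_m   thickness of crust carrying that contrast (assumed)
    delta_topo_m  contrast in sea-floor topography, m
    Rock replaces sea water where the sea floor is shallower.
    """
    from_density = delta_rho * G_ACC * thickness_m
    from_topography = (rock_rho - water_rho) * G_ACC * delta_topo_m
    return (from_density + from_topography) / 1e6


# Density contrasts of 150-200 kg m^-3 and ~0.8 km of topography (Fig. 1d).
for delta_rho in (150.0, 200.0):
    dp = lithostatic_contrast_mpa(delta_rho, 20e3, 800.0)
    print(f"delta_rho = {delta_rho:.0f} kg m^-3 -> about {dp:.0f} MPa")
```

Under these assumptions both end-member density contrasts give values inside the 40–60 MPa range; a thinner assumed crust would lower the estimate.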


3 MARCH 2016 | VOL 531 | NATURE | 95 
© 2016 Macmillan Publishers Limited. All rights reserved 




Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 


Received 26 June; accepted 11 December 2015. 


1. Minson, S. et al. Bayesian inversion for finite fault earthquake source 
models – II: the 2011 great Tohoku-oki, Japan earthquake. Geophys. J. Int.
198, 922-940 (2014). 

2. Suwa, Y., Miura, S., Hasegawa, A., Sato, T. & Tachibana, K. Interplate coupling 
beneath NE Japan inferred from three-dimensional displacement field. 
J. Geophys. Res. 111, B04402 (2006).

3. Isozaki, Y., Aoki, K., Nakama, T. & Yanai, S. New insight into a subduction-related 
orogen: a reappraisal of the geotectonic framework and evolution of the 
Japanese Islands. Gondwana Res. 18, 82-105 (2010). 

4. Aki, K. Estimation of earthquake moment, released energy, and stress-strain 
drop from G-wave spectrum. Bull. Earthq. Res. Inst. 44, 23-88 (1966). 

5. Wang, K. & Bilek, S. L. Fault creep caused by subduction of rough seafloor 
relief. Tectonophysics 610, 1-24 (2014). 

6. Robinson, D. P., Das, S. & Watts, A. B. Earthquake rupture stalled by a
subducting fracture zone. Science 312, 1203-1205 (2006). 

7. Sparkes, R., Tilmann, F., Hovius, N. & Hillier, J. Subducted seafloor relief stops 
rupture in South American great earthquakes: implications for rupture 
behaviour in the 2010 Maule, Chile earthquake. Earth Planet. Sci. Lett. 298, 
89-94 (2010). 

8. Knopoff, L. Energy release in earthquakes. Geophys. J. Int. 1, 44-52 (1958). 

9. Bassett, D. & Watts, A. B. Gravity anomalies, crustal structure, and seismicity at 
subduction zones: 1. Seafloor roughness and subducting relief. Geochem. 
Geophys. Geosyst. 16, 1508-1540 (2015). 

10. Bassett, D. & Watts, A. B. Gravity anomalies, crustal structure, and seismicity 
at subduction zones: 2. Interrelationships between fore-arc structure 
and seismogenic behavior. Geochem. Geophys. Geosyst. 16, 1541-1576 
(2015). 

11. Seamless Digital Geological Map of Japan 1:200,000. Jul 3, 2012 version,
Research Information Database DB084 (https://gbank.gsj.jp/seamless/index_en.html?),
Geological Survey of Japan, National Institute of Advanced
Industrial Science and Technology (2012).

12. Goto, T., Yamaguchi, S., Sumitomo, N. & Yaskawa, K. The electrical structure
across the Median Tectonic Line in east Shikoku, southwest Japan.
Earth Planets Space 50, 405-415 (1998).

13. Kawamura, T., Onishi, M., Kurashimo, E., Ikawa, T. & Ito, T. Deep seismic
reflection experiment using a dense receiver and sparse shot technique for
imaging the deep structure of the Median Tectonic Line (MTL) in east
Shikoku, Japan. Earth Planets Space 55, 549-557 (2003).

14. Sato, H., Kato, N., Abe, S., Van Horne, A. & Takeda, T. Reactivation of an old plate
interface as a strike-slip fault in a slip-partitioned system: Median Tectonic
Line, SW Japan. Tectonophysics 644, 58-67 (2015).

15. Finn, C. Aeromagnetic evidence for a buried Early Cretaceous magmatic arc,
northeast Japan. J. Geophys. Res. 99, 22165-22185 (1994).

16. Morrow, C., Moore, D. E. & Lockner, D. The effect of mineral bond strength and
adsorbed water on fault gouge frictional strength. Geophys. Res. Lett. 27,
815-818 (2000).

17. Mitchell, E., Fialko, Y. & Brown, K. Temperature dependence of frictional healing
of Westerly granite: experimental observations and numerical simulations.
Geochem. Geophys. Geosyst. 14, 567-582 (2013).

18. Numelin, T., Marone, C. & Kirby, E. Frictional properties of natural fault gouge
from a low-angle normal fault, Panamint Valley, California. Tectonics 26 (2),
1-14 (2007).

19. Kanamori, H. The nature of seismicity patterns before large earthquakes. 
Earthquake Prediction 4, 1-19 (1981). 




20. Hetland, E. & Simons, M. Post-seismic and interseismic fault creep. II: Transient 
creep and interseismic stress shadows on megathrusts. Geophys. J. Int. 181, 
99-112 (2010). 

21. Kaneko, Y., Avouac, J.-P. & Lapusta, N. Towards inferring earthquake patterns 
from geodetic observations of interseismic coupling. Nature Geosci. 3, 
363-369 (2010). 

22. Hillers, G. & Wesnousky, S. Scaling relations of strike-slip earthquakes with
different slip-rate-dependent properties at depth. Bull. Seismol. Soc. Am.
98, 1085-1101 (2008).

23. Tinti, E., Bizzarri, A. & Cocco, M. Modeling the dynamic rupture propagation on 
heterogeneous faults with rate-and state-dependent friction. Ann. Geophys. 48 
(2), 327-345 (2005). 

24. Sun, T. & Wang, K. Viscoelastic relaxation following subduction earthquakes 
and its effects on afterslip determination. J. Geophys. Res. 120, 1329-1344 
(2015). 

25. Kanazawa, T. in Underwater Technology Symposium IEEE International 1-5, 
http://dx.doi.org/10.1109/UT.2013.6519911 (IEEE, 2013). 

26. Wallace, L. M. et al. Enigmatic, highly active left-lateral shear zone in 
southwest Japan explained by aseismic ridge collision. Geology 37, 143-146 
(2009). 

27. Wells, R. E., Blakely, R. J., Sugiyama, Y., Scholl, D. W. & Dinterman, P. A. 
Basin-centered asperities in great subduction zone earthquakes: a link 
between slip, subsidence, and subduction erosion? J. Geophys. Res. 108 (B10), 
1-30 (2003). 

28. den Hartog, S., Niemeijer, A. & Spiers, C. New constraints on megathrust slip 
stability under subduction zone P-T conditions. Earth Planet. Sci. Lett. 
353/354, 240-252 (2012). 

29. Song, T. & Simons, M. Large trench-parallel gravity variations predict 
seismogenic behavior in subduction zones. Science 301, 630-633 
(2003). 

30. Okuma, S. & Kanaya, H. Petrophysical Data Base of Basement Rocks in Japan 
for the 21st Century (PB-Rock 21). RIO-DB (Research Information Data Base) 
87, http://riodb02.ibase.aist.go.jp/pb-rock21/index.html (Geological Survey of
Japan, AIST, 2011). 

31. Murata, Y., Suda, Y., Kikuchi, T. & Chésajo, C. Rock Physical Properties of Japan: 
Density, Magnetism, P-Wave Velocity, Porosity, Thermal Conductivity (Geological 
Survey of Japan, 1991). 


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank S. Naif, C. Davies, C. Twardzik and S. Das for 
suggestions. Figure preparation and grid processing were conducted using the Generic
Mapping Tools (GMT) (http://gmt.soest.hawaii.edu/). D.B. was supported
by a University of Oxford Clarendon Scholarship, by a Green Foundation
Postdoctoral Fellowship in the Institute of Geophysics and Planetary Physics,
Scripps Institution of Oceanography, University of California, San Diego, by the
National Geospatial Agency (HM01771310008), and by the Scripps Seafloor
Electromagnetics Consortium (http://marineemlab.ucsd.edu/semc.html). 


Author Contributions D.B. and A.B.W. conceived the study and conducted 
grid processing. D.B. and D.T.S. calculated density anomalies. Y.F. conducted
numerical simulations of earthquake cycles. D.B., D.T.S. and Y.F. wrote the 
initial manuscript. All authors discussed the results and commented on the 
manuscript. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial 
interests. Readers are welcome to comment on the online version of the 
paper. Correspondence and requests for materials should be addressed to 
D.B. (dbassett@ucsd.edu). 




METHODS 


Spectral averaging of topography and gravity grids. Residual anomalies are
calculated using the satellite-derived free-air gravity anomaly grid of Sandwell et al.”
and the GEBCO shipboard bathymetry grid**. Free-air gravity anomalies onshore
are reduced to Bouguer gravity anomalies using a crustal density of 2,670 kg m⁻³.
Bathymetry and gravity grids are sampled by 1,200-km-long trench-normal
profiles, which are centred on the bathymetrically defined trench axis and spaced
~25 km along strike. The mean cross-sectional structure of the subduction zone
is calculated as the spectral average across each ensemble of profiles. The linearity
of the fast Fourier transform and its inverse means that spectral averages should
be indistinguishable from arithmetic averages, but small differences (<10 m and
<2 mGal) are present, resulting from the finite precision of the fast Fourier
transform and associated down-sampling’. Maintaining the geometry of the trench
axis, average profiles are extended along-strike to produce grids of each ensemble
average profile, which are then subtracted from the original data sets to produce grids
of residual topography and residual gravity anomalies. In contrast to profile-based
residual calculations, subtraction of an ensemble average grid preserves the full
1 min × 1 min resolution of the original data sets. This processing methodology
has been applied globally” and is illustrated in Extended Data Fig. 2.
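
The statement that spectral and arithmetic averages should agree follows from the linearity of the Fourier transform, and can be checked on a small example. The sketch below is an illustration only: it uses a plain discrete Fourier transform written out in full instead of an FFT library, applies no down-sampling, and averages three made-up profiles, so the only differences it can show are floating-point round-off. All function names and profile values are hypothetical.

```python
import cmath


def dft(x):
    """Discrete Fourier transform of a sequence (direct O(n^2) sum)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]


def idft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[j] * cmath.exp(2j * cmath.pi * j * k / n)
                for j in range(n)) / n
            for k in range(n)]


def spectral_average(profiles):
    """Average the profiles in the wavenumber domain, then transform back."""
    spectra = [dft(p) for p in profiles]
    n_prof = len(profiles)
    mean_spectrum = [sum(s[j] for s in spectra) / n_prof
                     for j in range(len(spectra[0]))]
    return [value.real for value in idft(mean_spectrum)]


def arithmetic_average(profiles):
    """Point-by-point mean of the profiles in the space domain."""
    n_prof = len(profiles)
    return [sum(p[k] for p in profiles) / n_prof
            for k in range(len(profiles[0]))]


# Three made-up trench-normal depth profiles (km), equal length and spacing.
profiles = [
    [-6.0, -7.5, -5.0, -3.0, -1.5, -0.5, 0.2, 0.4],
    [-6.2, -7.8, -5.5, -3.4, -1.2, -0.4, 0.1, 0.5],
    [-5.9, -7.2, -4.6, -2.8, -1.9, -0.7, 0.3, 0.3],
]

spec = spectral_average(profiles)
arith = arithmetic_average(profiles)
max_diff = max(abs(s - a) for s, a in zip(spec, arith))
print(f"largest difference: {max_diff:.2e} km")
```

Any larger discrepancy in practice, such as the <10 m and <2 mGal differences reported above, has to come from steps this sketch omits, for example down-sampling.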
Calculation of density anomalies. Using the average topographic profile and 
active-source seismic constraints on across-arc crustal structure (Supplementary 
Table 2 and Extended Data Figs 4 and 5), the ensemble average gravity anomaly
profile can be well fitted using realistic density distributions for the crust
(~2,800 kg m⁻³) and mantle (~3,100 kg m⁻³) (Extended Data Fig. 3). This shows
that the ensemble average gravity anomaly captures the broad crustal architecture 
of the subduction zone, enabling short-wavelength residual gravity anomalies 
landward of the trench axis to be interpreted as reflecting crustal structure of the 
forearc, arc and backarc. 

The forearc segment boundary extends to the trench axis with no change in 
amplitude across the intersection of the subducting Pacific plate with the forearc 
Mohorovičić discontinuity (Moho) (Fig. 1b). It is thus unlikely (and nearer the trench
impossible) that variations in forearc Moho depth contribute to the along-strike
contrast in gravity anomalies. Hence, we attribute residual gravity anomalies
to trench-slope topography and lateral changes in density of the overthrusting
forearc crust. The magnitude of these changes is calculated by discretizing
the forearc into 10 km × 10 km vertical prisms. The top of each prism is
constrained by the sea floor and the base is constrained by either the forearc 
Moho or the top of the subducting slab (whichever is shallower). The geometry 
of both interfaces is constrained by active-source wide-angle seismic models 
(Extended Data Figs 4 and 5 and Supplementary Table 2). The initial density 
contrast Δρ for each prism is calculated directly from the amplitude of residual
gravity anomalies R_grav:

$$\Delta\rho = \frac{R_{\mathrm{grav}}}{2\pi G h}$$

where h is the crustal thickness of the forearc and G is the gravitational constant.
Synthetic gravity anomalies are calculated using the Fatiando:Gravmag module™, 
which calculates the gravity effect of three-dimensional rectangular prisms using 
the formula of Plouff*°. Outstanding residual gravity anomalies are used to update 
Δρ within each prism. After 12 iterations the root-mean-square misfit between the
observed and calculated residual gravity anomalies is <1 mGal (inset to Extended 
Data Fig. 6b). The magnitude of density contrasts is inversely proportional to the 
vertical extent of the causative prism. Our methodology thus yields minimum 
estimates with density contrasts distributed throughout the full thickness of the 
forearc crust, h. As shown by the contours in Fig. 1c, the larger density contrasts 
near the trench axis predominantly reflect the reduction in forearc crustal thick- 
ness (Extended Data Fig. 3c) although they may also reflect, in part, contrasting 
resistances to near-trench deformation. The distribution of density anomalies is not 
strongly dependent on the observations used to constrain forearc crustal thickness 
and experimentation between passive (SLAB1.0* and earthquake tomographic*”)
versus active-source seismic (Supplementary Table 2) constraints on subducting
slab and forearc Moho geometries yields differences in density anomalies of
<10 kg m⁻³ (Extended Data Fig. 7). These differences are an order of magnitude
smaller than the ~150 kg m⁻³ contrast in mean forearc density calculated across
the forearc segment boundary (Fig. 1c, d).
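
The estimate-and-update scheme described above can be sketched on a one-dimensional row of prisms. This is a toy illustration under stated assumptions, not the procedure used in the study: the three-dimensional prism calculation is replaced by a hypothetical forward model in which each prism produces its infinite-slab effect 2πGΔρh and shares a fixed fraction `leak` of that signal with each neighbour. The thicknesses, density contrasts and function names are all made up for the example.

```python
import math

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
MGAL = 1e-5     # 1 mGal in m s^-2
TWO_PI_G = 2.0 * math.pi * G


def forward(delta_rho, thickness, leak=0.15):
    """Toy forward model (an assumption, not a prism formula).

    Each prism contributes its slab effect 2*pi*G*delta_rho*h, and a
    fraction `leak` of that signal is shared with each neighbour.
    Returns gravity in mGal.
    """
    slab = [TWO_PI_G * d * h / MGAL for d, h in zip(delta_rho, thickness)]
    out = []
    for i, s in enumerate(slab):
        g = (1.0 - 2.0 * leak) * s
        if i > 0:
            g += leak * slab[i - 1]
        if i < len(slab) - 1:
            g += leak * slab[i + 1]
        out.append(g)
    return out


def to_density(gravity_mgal, h):
    """Slab formula: density contrast = R / (2*pi*G*h)."""
    return gravity_mgal * MGAL / (TWO_PI_G * h)


def invert(residual_mgal, thickness, n_iter=12):
    """Initial slab estimate, then n_iter updates from the outstanding misfit."""
    delta_rho = [to_density(r, h) for r, h in zip(residual_mgal, thickness)]
    for _ in range(n_iter):
        synthetic = forward(delta_rho, thickness)
        delta_rho = [d + to_density(r - s, h)
                     for d, r, s, h in zip(delta_rho, residual_mgal,
                                           synthetic, thickness)]
    return delta_rho


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


# Synthetic along-strike row: denser crust at one end, lighter at the other.
thickness = [20e3, 20e3, 18e3, 18e3, 16e3, 16e3, 14e3, 14e3]     # m, assumed
true_rho = [90.0, 90.0, 80.0, 60.0, -60.0, -80.0, -90.0, -90.0]  # kg m^-3, assumed
observed = forward(true_rho, thickness)

estimate = invert(observed, thickness)
misfit = rms([o - s for o, s in zip(observed, forward(estimate, thickness))])
print(f"RMS misfit after 12 iterations: {misfit:.3f} mGal")
```

Because the update divides the outstanding misfit by 2πGh, a thinner prism needs a larger density change to explain the same anomaly, which is the inverse proportionality noted in the text.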

Comparison with earthquake distributions. Forearc structure is compared with 
the slip behaviour of the megathrust using published slip models and aftershock 
distributions for earthquakes with Mw > 7 occurring since 1896. This catalogue is
presented as Supplementary Table 3. For earthquakes with 6.5 < Mw < 7.5 between
1976 and 2015 (duration of instrumental records), rupture areas are estimated from 
the distribution of aftershocks in the International Seismological Center (ISC) 
(http://www.isc.ac.uk/) earthquake catalogue. Global analyses have shown that 
there is little change in aftershock areas after the first week*, and we estimate
rupture areas from one-week aftershock distributions. 

Tajima et al.*° reviewed 44 published slip distributions for the Tohoku-oki earth- 
quake. Most show the maximum slip to be >40 m. Following the asperity definition 
applied in northeast Japan by Yamanaka and Kikuchi’, we define the Tohoku 
asperity as regions where the co-seismic slip was >20 m and half the maximum 
slip amplitude (thickened contour in Fig. 3c).
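
The asperity definition above amounts to thresholding a slip model. A minimal sketch follows, assuming one reading of the criterion in which a cell must exceed both 20 m and half the maximum slip. The grid values and the function name `asperity_mask` are hypothetical and do not come from any published slip model.

```python
def asperity_mask(slip, floor_m=20.0, fraction=0.5):
    """Flag grid cells that belong to the asperity.

    A cell qualifies when its slip exceeds both `floor_m` and
    `fraction` of the maximum slip in the model.
    """
    peak = max(max(row) for row in slip)
    threshold = max(floor_m, fraction * peak)
    return [[value > threshold for value in row] for row in slip]


slip_model = [  # co-seismic slip in metres on a coarse, made-up grid
    [0.0, 5.0, 12.0, 8.0],
    [4.0, 22.0, 41.0, 18.0],
    [6.0, 25.0, 35.0, 10.0],
    [0.0, 3.0, 9.0, 2.0],
]

mask = asperity_mask(slip_model)
n_cells = sum(cell for row in mask for cell in row)
n_total = sum(len(row) for row in slip_model)
print(f"{n_cells} of {n_total} cells exceed the threshold")
```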
Extended Data Figure 1 | Rupture areas and co-seismic slip amplitudes
in Earth’s largest earthquakes!*!~*8, Grey bars and black dots plot the
mean and maximum amounts of co-seismic slip against the area of fault
rupture. The five earthquakes with Mw > 9 are labelled and numbers 6-10
refer to the location of smaller-magnitude earthquakes in the catalogue
shown in Supplementary Table 1. Note the anomalously large (>70 m)
amount of co-seismic slip in the 2011 Tohoku-oki event¹. The black arrow
shows the maximum amount of slip in the 1906 San Francisco earthquake.
The rupture area in this strike-slip event was 6 × 10³ km² and two orders of
magnitude smaller than the megathrust rupture areas plotted here.
Extended Data Figure 2 | Grid processing methodology. Panels illustrate
the ensemble-averaging and grid-processing methodology as applied at
the northeast Japan subduction zone. Regional grids of bathymetry** and
free-air/Bouguer-corrected (FA/BC) gravity anomaly** are sampled along
trench-normal profiles (a and b). The spectral average is calculated from
each ensemble of profiles (shown as insets to c and d). Maintaining the
geometry of the trench, grids of the average profile are constructed (c and d),
and subtracted from the original data sets to reveal residual bathymetry (e)
and residual gravity anomalies (f). This technique of spectral averaging is
identical to that applied globally”.
Extended Data Figure 3 | Ensemble average profiles and two-dimensional
gravity model. a, Ensemble average topographic profile.

b, Ensemble average gravimetric profile (black). The red profile 

shows the gravity anomaly calculated for the two-dimensional density 
structure shown in d. c, Mean forearc density (left) and forearc crustal 
thickness (right) plotted against distance from the trench axis. The larger 
amplitudes of density anomalies near the trench predominantly reflect
the reduction in forearc crustal thickness h although they may also reflect, 
in part, contrasting resistances to near-trench deformation. d, Model of 
crustal structure for the northeast Japan subduction zone. This model is 
constructed using the mean geometry of the trench slope (shown in a) 


and subducting slab, seismic constraints on forearc and subducting 

slab (~7 km) crustal thicknesses (Supplementary Table 2), and using 
reasonable values for crustal (~2,800 kg m⁻³) and mantle densities
(~3,100 kg m⁻³). The good fit observed in b between the ensemble
average (black) and calculated (red) gravity anomalies shows that the 
ensemble average gravity anomaly captures the broad crustal architecture 
of the subduction zone, which enables the residual anomalies revealed 
following the removal of this average to be interpreted. The short- 
wavelength nature of residuals in the northeast Japan forearc suggests that 
most are related to crustal structure. 
Extended Data Figure 4 | Subducting slab geometry and forearc crustal
thickness. a, Geometry of the subducting Pacific Plate as constrained
by linearly interpolating along-strike between active-source wide-angle
seismic profiles?*°. Profiles are numbered as listed in Supplementary
Table 2 and plotted in Extended Data Fig. 5. Red triangles show arc
volcanoes. b, Forearc crustal thickness as constrained by the wide-angle
profiles shown in a. The dotted line marks the intersection of the
subducting slab with the forearc Moho. Crustal thickness is calculated by
subtracting the observed bathymetry from the seismically constrained
base of the forearc crust. c, As in b, but with forearc Moho depth
constrained by the tomographic model of Katsumata*”.
Extended Data Figure 5 | Wide-angle seismic models. Profiles are
numbered as in Extended Data Fig. 4a and Supplementary Table 2“°~°°. 
See legend for figure nomenclature. Dots show slab and Moho positions 
at profile intersections. Horizontal axes show model kilometres. The 
slab and Moho geometries shown for profiles 5 and 6 are from reflector 
distributions imaged by travel-time mapping™. Intersecting profile 3
suggests that Moho reflectors interpreted south of 120 km on profile 6 may 
originate from the top of the subducting Pacific plate and forearc Moho 
constraints are only incorporated north of model kilometre 150. 
Extended Data Figure 6 | Calculation of forearc density anomalies.
a, Observed residual gravity anomalies. Black contours show forearc
crustal thickness (5 km increment). b, Synthetic gravity anomalies
calculated from the distribution of forearc density anomalies shown in c.
c, Distribution of density anomalies. Density contrasts are constant
within 10 km × 10 km vertical prisms extending between the seabed and
either the top of the subducting slab or the forearc Moho (whichever is
shallower). Initial density contrasts for each prism are estimated directly
from residual gravity anomalies using the known thickness of each
prism, with the difference between observed and synthetic residual gravity
anomalies similarly applied to update model parameters. The inset to b
shows the reduction in root-mean-square misfit with each update of model
parameters. After 12 iterations the root-mean-square misfit is <1 mGal.
d, Difference between observed and synthetic residual gravity anomalies.
Extended Data Figure 7 | Density anomalies calculated using different
constraints on forearc crustal thickness. a, Density anomalies calculated
using active-source seismic constraints on the slab and forearc Moho
(Extended Data Fig. 4a and b). b, Density anomalies calculated using
active-source seismic constraints on the geometry of the subducting
Pacific Plate, but using the tomographic model of Katsumata®’ to constrain
the forearc Moho (Extended Data Fig. 4c). c, Density anomalies calculated
using SLAB1.0°° for the subducting Pacific Plate and assuming a planar
forearc Moho at the mean depth (25 km) determined by active-source
seismic models. The differences in density models calculated using these
different model parameterizations are of the order of 10–20 kg m⁻³. All
panels show a clear north-to-south reduction in density anomalies across
the forearc segment boundary (red dashed line), and our interpretation
of this contrast is not dependent on the observations used to constrain
forearc crustal thickness. A comparison between forearc structure inferred
from residual gravity anomalies and the seismic velocity structure of the
forearc*” is shown in Supplementary Fig. 1.
Extended Data Figure 8 | Co-seismic slip models for the March 
2011 Tohoku-oki earthquake. Plots showing the correlation between 
overthrusting plate structure as constrained by residual gravity anomalies 
and the distribution of slip in the Tohoku-oki earthquake. These models 
have been constructed using different data types and inversion strategies. 
In all plots, grey and red dashed lines mark the trench axis and the 
forearc segment boundary respectively. Contour intervals are labelled 
and the outermost contour is 0. a, Minson et al.; b, Simons et al.; 
c, Ammon et al.; d, Yue and Lay; e, Melgar and Bock; f, Sato et al.; 
g, Ozawa et al. (allowing slip at trench); h, Ozawa et al. (imposing no-slip 
condition at trench); i, Fujii et al. In all plots, large co-seismic slip is 
focused north of the forearc segment boundary in regions characterized 
by positive residual gravity anomalies. Most models also show a sharp 
reduction in the magnitude of slip from north to south across the forearc 
segment boundary. The wide range of data types and inversion strategies 
represented by this ensemble of models suggests that the common features 
identified above are probably robust characteristics of the Tohoku-oki 
earthquake rupture. 
Extended Data Figure 9 | Postseismic observations. a, Aftershocks 
between March 11 and May 24 occurring on the subduction interface. 
All plots show the trench axis (grey dashed line), forearc segment 
boundary (red dashed line) and contours (10 m) of co-seismic slip. b, All 
aftershocks (variable location/mechanism) occurring within seven months 
of the Tohoku mainshock with Mw > 5 (ref. 63). Panels a and b show that 
interplate aftershocks for the Tohoku-oki earthquake did not occur in 
areas that experienced large co-seismic displacements. The vast majority 
occur in regions surrounding the mainshock rupture area, the distribution 
of which supports the Bayesian slip distribution of ref. 1, and provides a 
useful constraint on the along-strike extent of co-seismic rupture. The 
negative correlation between co-seismic slip and aftershock locations is 
strongest for the aftershock locations of ref. 65, because they have a lower 
magnitude cut-off (and hence more events) and because they evaluate 
Kagan angles to isolate interplate aftershocks from those occurring within 
either the subducting or overthrusting crust, both of which show no 
correlation with the co-seismic rupture area (see figure 3b and c of ref. 65). 
These aftershock locations and the distributed slip models shown in 
Extended Data Fig. 8 suggest that large co-seismic displacements (>20 m) 
in the Tohoku-oki earthquake did not continue >50 km southeast of the 
MTL. c, Afterslip. Red arrows show one-year postseismic displacements 
of seafloor GPS sites. Blue arrows show predicted GPS vectors from the 
viscoelastic model of Sun and Wang. Thick grey contours (numbers are 
in metres) show the distribution of afterslip. In the dip direction, shallow 
afterslip is constrained to occur predominantly seaward of site FUKU and 
thus south of the forearc segment boundary. In the along-strike direction, 
the northern termination of the afterslip patch is expected to be south of 
the main rupture area. To the south, the afterslip may extend much farther 
than depicted by the slip patch shown in Extended Data Fig. 9c and may 
extend as far south as the Joban seamount chain. The occurrence of rapid 
afterslip is exactly what would be expected if the region southeast of the 
forearc segment boundary displayed rate-strengthening behaviour during 
the Tohoku-oki earthquake. 
Extended Data Figure 10 | Numerical model of earthquake cycles 
on a fault obeying rate-state friction in the presence of spatially 
heterogeneous frictional properties. a, Assumed distribution of the 
rate-dependence parameter a − b. b, Assumed distribution of the coefficient 
of friction. c, Evolution of fault slip in space and time. Black lines denote 
interseismic fault slip every 5 years, and red dashed lines denote coseismic 
slip with a time interval of 2 s. To illustrate the effects of spatial variations 
in the coefficient of friction and the rate-dependence parameter a − b 
on the patterns of seismicity, we performed simulations of earthquake 
cycles of a fault governed by rate-state friction. We assumed a relatively 
simple case of two asperities (high stress, strong velocity-weakening 
fault sections) separated by a weak (low coefficient of friction, weak 
velocity-weakening) fault section. The computational domain is 100 km 
long, and the characteristic size of asperities is 20 km. Simulations were 
performed using a boundary integral method. The fault is driven at a 
fault distance of 100 km by prescribing a constant velocity of 100 mm yr⁻¹. 
We assumed a constant normal stress of 50 MPa, and slip-weakening 
displacement of 10 mm. Unlike the case of a single velocity-weakening 
asperity that evolves to a sequence of characteristic earthquakes, the 
modelled earthquake sequence reveals a rich complexity and resembles 
many features of seismicity in the Tohoku area. There are a number of 
sub-events of variable size that nucleate predominantly at the boundaries 
of high-strength asperities, but are arrested before they grow into 
system-size earthquakes. These sub-events may be analogous to large (Mw 7) 
earthquakes that occurred in the Tohoku area before and after the great 
2011 earthquake. Occasionally, the entire area breaks in a mega-event (for 
example, between 60 m and 70 m of cumulative slip). The slip magnitude 
in these events is determined by the prior slip history and pre-stress. The 
model also predicts episodic creep in the middle of the velocity-weakening 
patch (for example, at 38 m of cumulative slip between 45 km and 60 km) 
that may be relevant to inferences of low seismic coupling of certain parts 
of the megathrust. Additional complexity in slip behaviour is likely to 
be introduced by variations in frictional properties at different spatial 
wavelengths, and in both along-strike and down-dip directions. 
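
The role of the rate-dependence parameter a − b in this caption can be illustrated with a short sketch. The caption does not state which rate-state formulation was used, so the code assumes the common Dieterich-Ruina form, μ = μ0 + a ln(V/V0) + b ln(θV0/Dc), with steady state θ = Dc/V. The default values of μ0, a, b and V0 are hypothetical; Dc is set to the 10 mm slip-weakening displacement quoted in the caption, which is itself an assumed mapping.

```python
import math


def friction(v, theta, mu0=0.6, a=0.010, b=0.015, v0=1e-6, dc=0.01):
    """Rate-and-state friction coefficient for slip rate v (m/s) and state theta (s)."""
    return mu0 + a * math.log(v / v0) + b * math.log(theta * v0 / dc)


def steady_state_friction(v, mu0=0.6, a=0.010, b=0.015, v0=1e-6):
    """Friction at steady state (theta = dc / v), where only a - b matters."""
    return mu0 + (a - b) * math.log(v / v0)


def regime(a, b):
    """Classify a fault patch by the sign of a - b."""
    if a - b < 0:
        return "velocity-weakening"
    if a - b > 0:
        return "velocity-strengthening"
    return "velocity-neutral"
```

With a − b < 0 the steady-state friction falls as slip rate rises, so slip can accelerate into an earthquake; with a − b > 0 friction rises with slip rate, which favours stable creep and afterslip.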
Evidence from cyclostomes for complex 
regionalization of the ancestral vertebrate brain 


Fumiaki Sugahara, Juan Pascual-Anaya, Yasuhiro Oisi, Shigehiro Kuraku, Shin-ichi Aota, Noritaka Adachi, 
Wataru Takagi, Tamami Hirai, Noboru Sato, Yasunori Murakami & Shigeru Kuratani 


The vertebrate brain is highly complex, but its evolutionary origin 
remains elusive. Because of the absence of certain developmental 
domains generally marked by the expression of regulatory genes, 
the embryonic brain of the lamprey, a jawless vertebrate, had 
been regarded as representing a less complex, ancestral state of 
the vertebrate brain. Specifically, the absence of a Hedgehog- and 
Nkx2.1-positive domain in the lamprey subpallium was thought to 
be similar to mouse mutants in which the suppression of Nkx2-1 
leads to a loss of the medial ganglionic eminence. Here we show 
that the brain of the inshore hagfish (Eptatretus burgeri), another 
cyclostome group, develops domains equivalent to the medial 
ganglionic eminence and rhombic lip, resembling the gnathostome 
brain. Moreover, further investigation of lamprey larvae revealed 
that these domains are also present, ruling out the possibility of 
convergent evolution between hagfish and gnathostomes. Thus, 
brain regionalization as seen in crown gnathostomes is not an 
evolutionary innovation of this group, but dates back to the latest 
vertebrate ancestor before the divergence of cyclostomes and 
gnathostomes more than 500 million years ago. 

During development, the vertebrate brain exhibits highly organized 
regionalization along the dorsoventral and anteroposterior axes; the 
entire neural tube is divided dorsoventrally into basal and alar plates, 
medially united by the roof plate dorsally, and by the floor plate ven- 
trally. Anteroposteriorly, the brain primordium is divided into a series 
of ‘neuromeres’, the boundaries of which serve as scaffolds for axonal 
growth, leading to conserved patterns of nerve tracts (Fig. 1a, b and 
Extended Data Fig. 1). Each domain is determined through specific and 
regionalized gene expression to differentiate into a functionally special- 
ized region. This brain regionalization, with distinct domains marked 
by differential, specific gene expression patterns (or genoarchitecture), 
is surprisingly conserved in the gnathostome embryos examined so far 
(Fig. 1a, b), showing that the basic developmental plan of this organ 
has a deep evolutionary origin and continues to serve as a blueprint 
for brain development across the gnathostomes. However, it remains 
unclear when this developmental programme emerged in evolution. 

Previous studies on the brain of the lamprey, one of the two groups of 
cyclostomes together with the hagfish, have suggested that this might 
represent a pre-gnathostome state of brain evolution, mainly because of 
the absence of two key domains: (1) the pallidum, which is involved in 
motor control and develops from a Nkx2.1-positive region within the 
subpallium in gnathostomes, the medial ganglionic eminence (MGE) 
(Fig. 1a, c); and (2) a morphologically distinct cerebellum, which has 
crucial roles in sensory integration and motor planning, as well as 
complex cognitive processing in gnathostomes (Fig. 1a, c). The 
absence of such structures in the lamprey has placed the origin of the 
complete vertebrate brain architecture, with an MGE and cerebellum, 
at the base of the gnathostome lineage. However, GABAergic 
(γ-aminobutyric-acid-releasing) interneurons, which in gnathostomes 
migrate from the embryonic MGE and pallidum-like domains, have 
been observed in juvenile and adult lampreys (Fig. 1d), raising 
the possibility that the embryonic brain of the lamprey represents a 
secondary reduction from the ancestral one. 

To address this question, we studied a member of a different group 
of cyclostomes, the inshore hagfish (E. burgeri). The adult brain of 
the hagfish has a greatly reduced brain ventricle and exhibits highly 
differentiated organization of the pallium, which makes it difficult to 
compare with that of other vertebrates (Extended Data Fig. 2a, b). 




Figure 1 | Regionalization of the vertebrate brains. a, Schematic drawing 
of the embryonic gnathostome brain based on a mouse embryonic 
day (E)12.5 embryo. Inset shows a transverse section at the level of the 
hindbrain showing the position of the rhombic lip. b, Nerve tracts and 
commissures in the gnathostome embryonic brain based on the shark 
embryo (Extended Data Fig. 1). c, Embryonic lamprey brain as revealed 
by previous studies. d, Adult brain of the lamprey, based on refs 10 and 20. 
Red dotted lines indicate the alar plate-basal plate boundary (sulcus 
limitans). Blue lines delimit the telencephalon-diencephalon (tel-di) 
boundary. Orange circles in b indicate the position of cell bodies of 
the early nerve tracts. Coloured regions in c correspond to the same 
colours in d. Note that although the pallidum equivalent region (‘p’) 
was suggested to be present in the adult lamprey brain, its embryonic 
primordium (MGE: Nkx2.1-positive region) has not been reported so far. 
ac, anterior commissure; cc, cerebellar commissure; ce, cerebellum; ch, 
optic chiasm; h, hindbrain; hc, habenular commissure; hy, hypothalamus; 
m, midbrain; LGE, lateral ganglionic eminence; mlf, medial longitudinal 
fascicle; p, prospective pallidum; pal, pallium; pc, posterior commissure; 
poc, postoptic commissure; pt, pretectum; pth, prethalamus; r1-7, 
rhombomeres 1-7; rl, rhombic lip; sot, supraoptic tract; str, striatum; tel, 
telencephalon; th, thalamus; thc, tract of the habenular commissure; tpc, 
tract of the posterior commissure; tpoc, tract of the postoptic commissure; 
4v, fourth ventricle. 
Moreover, as is the case for the lamprey, neither a pallidum nor a 
cerebellum has been identified in the hagfish. The identification of 
embryonic domains in the hagfish brain has also been difficult, mainly 
because of the distortion apparent in fixed specimens. To overcome 
these problems, we obtained embryos of E. burgeri at Bashford 
Dean stages 45 and 53 (refs 17 and 18; Extended Data Fig. 2c-h), 
and compared brain development with lampreys, as well as with the 
cloudy catshark (Scyliorhinus torazame), as a gnathostome outgroup 
to cyclostomes. 

Observations of the developing hagfish brain (Fig. 2 and Extended 
Data Fig. 3) prompted us to consider putative homologies of hagfish 
brain regionalization with those of other vertebrates. At stage 53, the 
hagfish brain clearly shows well-differentiated domains with conspic- 
uous ventricles similar to those in gnathostomes (Fig. 2h), which we 
describe here according to the prosomeric model. In gnathostomes and 
the lamprey, transverse commissures serve as landmarks for the 
identification of neuromeres (Fig. 1a–c). Longitudinal nerve tracts, 
such as the medial longitudinal fasciculus and some commissures— 
including the optic chiasm and the habenular and posterior commis- 
sures—allowed us to mark the regionalization of the hagfish embryonic 
brain. In this, the habenular commissure defines the roof of the 
thalamus, or prosomere 2 (p2), and the posterior commissure marks 
the posterior boundary of the pretectum (p1) (Figs 1 and 3a, h). 
The ventral limit of the telencephalon is defined as dorsal to the optic 
chiasm (Fig. 3a). The interbulbar commissure, already described in 
the lamprey, is also present in the hagfish (Fig. 3a, h). The latter 
commissure is equivalent to the gnathostome pallial or hippocampal 
commissures in the telencephalon, and crosses secondary olfactory 
fibres as in the lamprey. Finally, the commissure lying posterior to the 
midbrain-hindbrain boundary of the hagfish brain corresponds to the 
commissura vestibulo-lateralis, which consists of the eighth nerve and 
lateral line nerve fibres as in the lamprey, although in the hagfish the 
cerebellar commissure has not been identified (Fig. 3a). Throughout 
the developmental stages examined, we could not identify any overt 
epiphysis in the hagfish, although this has been described in the 
lamprey and jawed vertebrates (Extended Data Fig. 4a-g). 

To test for the presence or absence of neuromeres, we investigated 
the brain genoarchitecture of the hagfish embryo to evaluate its 
regionalization. The telencephalon of the hagfish embryo could be 
identified as a domain expressing FoxG1, EmxB and Pax6 orthologues 
(Fig. 3c–e, j–l and Extended Data Figs 2k–m, r–t and 5). The 
hypothalamus was located rostral to the diencephalon, identified by the 
co-expression of one hagfish Hedgehog gene, Hh2 (one of the three hagfish 
Hh genes newly identified in this study, named Hh2-4, and distinct 
from our previously reported hagfish Hh1 (ref. 18); see Methods and 
Extended Data Figs 5 and 6), and Nkx2.1/2.4, an orthologous gene of 
both gnathostome Nkx2.1 and Nkx2.4 genes (Fig. 3f, g and Extended 
Data Figs 2n, o and 5). However, we could not determine whether the 
anterior end of the floor plate was hypothalamic or not. As in the 
lamprey, hagfish Hh2 expression also clearly marked the position of 
the zona limitans intrathalamica (ZLI) between the prethalamus (p3) 
and thalamus (p2). This position had not been identified previously 
(Extended Data Fig. 4h-k). Given the Shh-dependent inductive activity 
of the ZLI, its presence is consistent with the p2–p3 boundary in the 
hagfish diencephalon as noted earlier. More caudally, the pretectum 
(p1), which is identified by the posterior commissure, expressed Pax6, 
as in gnathostomes (Fig. 3a, k and Extended Data Fig. 2i, t). Thus, in 
the hagfish, we clearly identified the hypothalamus and diencephalic 
prosomeres (pretectum (p1), thalamus (p2) and prethalamus (p3)) 
rostral to the midbrain, with the ZLI between p2 and p3. Finally, the 
mesencephalon was defined as a Pax6-negative region caudal to the 
posterior commissure (Fig. 3a, h, k and Extended Data Fig. 2i, t). 

Notably, and unlike in lampreys, we detected in the stage 53 
E. burgeri embryo an independent and conspicuous expression domain 
of Nkx2.1/2.4, dorsal (rostral according to conventional columnar 
model) to the optic chiasm, corresponding to the region where the 


98 | NATURE | VOL 531 | 3 MARCH 2016 




Figure 2 | Development of the ventricular system in the hagfish brain. 
a-h, Three-dimensional images reconstructed from serial sections of 
E. burgeri embryos with Avizo software. External (a, c, e and g) and 
internal (b, d, f and h) views are shown. The central nervous systems are 
in purple; ectoderm is in light blue and green; endoderm is in yellow; and 
notochord is in orange. The developmental stages are based on ref. 18. 
Identification of brain regions is based on brain morphology, nerve 
staining and comparative gene expression patterns (Fig. 3 and Extended 
Data Figs 2 and 3). At stage (st.) 53, lateral ventricles can be divided into a 
telencephalic and a diencephalic part (lv(t) and lv(d), respectively). e, eye; 
ec, ectoderm; en, endoderm; f, forebrain; lv, lateral ventricle; mo, mouth 
opening; n, notochord; ne, nasal epithelium; nhp, naso-hypophyseal plate; 
no, nasal opening; os, optic stalk; ot, otic vesicle; opm, oropharyngeal 
membrane; pp, pharyngeal pouches; som, secondary oral membrane; 
3v, third ventricle. See Fig. 1 for other abbreviations. Scale bars, 200 μm. 


gnathostome MGE would form (Fig. 3g). Moreover, Hh2 was also 
expressed in a slightly restricted area of this Nkx2.1/2.4-positive 
domain, possibly corresponding to a part of the MGE and preoptic 
area (Fig. 3f). Therefore, the putative MGE in the hagfish arises as 
a conspicuous region within the brain, located in the rostral part of 
the telencephalon (ventral according to the conventional columnar 
model), and possibly associated with the lateral ganglionic eminence 
posterior to it (Fig. 3m). 

In modern gnathostomes, the cerebellum differentiates from the 
dorsal part of rhombomere 1, whereby granular cells are supplied from 
the rhombic lip to constitute a layer of the cerebellar cortex. The 
rhombic lip in gnathostomes expresses Pax6, which enables granular cells 
to migrate. Therefore, regional specification of the rhombic lip is a 
Figure 3 | Gene expression and nervous staining in E. burgeri at stage 
53. a, h, Immunohistochemical staining of the axon bundles by 
anti-acetylated tubulin antibody (green) and DAPI (blue). b, i, Haematoxylin 
and eosin (HE) staining. c–g, j–l, Gene expression patterns involved 
in the regionalization of the forebrain. Arrowheads in f and g indicate 
expressions in the MGE region of the telencephalon. m, Schematic 
drawing of gene expressions in the telencephalon. The telencephalic 
territory can be identified from the expressions of FoxG, Pax6 and 
EmxB. cvl, commissura vestibulo-lateralis; di, diencephalon; 
ibc, interbulbar commissure; on, olfactory nerve. See Fig. 1 for other 
abbreviations. Scale bars, 500 μm. 


prerequisite for the acquisition of the cerebellum. However, an overt 
cerebellum is missing in cyclostomes, although there are fibres possibly 
equivalent to commissures posterior to the midbrain-hindbrain 
boundary in gnathostomes (Fig. 1). In the lamprey, previous studies 
have deduced the absence of a rhombic-lip-derived cerebellum from 
the lack of Pax6 expression in the corresponding area. Unlike in the 
lamprey, the hagfish rhombencephalon clearly develops a Pax6-positive 
rhombic lip, as in gnathostomes (Fig. 4a—d). This was confirmed by the 
expression of a different rhombic lip marker, Atoh1, which is involved 
in the development of excitatory cells in the gnathostome cerebellum, 
in the hagfish dorsal hindbrain (Extended Data Fig. 7n). 

Our results show that the hagfish and gnathostome embryonic brains 
share a common regional patterning, leading us to two scenarios: 
(1) the hagfish/gnathostome genoarchitecture represents an ancestral 
programme for all vertebrates, and lampreys have lost the MGE and rhom- 
bic lip secondarily; or (2) the ancestral brain developmental pattern did 
not have an MGE or rhombic lip, which were acquired independently 
in the lineages of hagfish and gnathostomes by convergent evolution. To 
clarify this, we have reinvestigated brain development of the Japanese 
lamprey (Lethenteron japonicum) with a recently reported draft genome 
sequence. We found two extra lamprey Nkx2.1 orthologous genes 
(Nkx2.1/2.4B and Nkx2.1/2.4C; see Methods, Extended Data Figs 5 
and 8 and Extended Data Table 1), as well as one extra Hedgehog par- 
alogue, HhD, in addition to the previously reported Nkx2.1 (renamed 
as Nkx2.1/2.4A) and HhA-C genes (Extended Data Figs 5 and 9). 
Surprisingly, lamprey Nkx2.1/2.4B and Nkx2.1/2.4C were expressed 
in a rostral domain within the subpallium, suggesting the presence of 
an MGE in the lamprey (Fig. 4g, h and Extended Data Fig. 9). On the 
other hand, no HhD expression was detected (Extended Data Fig. 9), 
suggesting that, although present, specification of the MGE in the lam- 
prey is still somehow divergent compared with other vertebrates. 
Figure 4 | Presence of the rhombic lip and MGE in the hagfish and 
lamprey brains, and an evolutionary model of the vertebrate brain. 
a-f, Pax6 expression in the hindbrain of a stage 31 catshark embryo (a, b) 
and a hagfish stage 53 embryo (c, d). Atoh1 expression in a lamprey stage 
26 embryo (e, f). Sagittal views (a, c, e) and transverse sections (b, d, f) 
are shown. Arrowheads indicate expressions in the rhombic lip. Inset 
in c shows rhombic lip expression on a lateral section. g, h, Nkx2.1/2.4 
homologue gene expression patterns in a lamprey stage 27 embryo. 
Arrows indicate telencephalic expression of Nkx2.1/2.4B and C. 
i, Schematic drawing of embryonic hagfish and lamprey brains (stages 53 
and 26, respectively) as revealed by the present study. j, An evolutionary 
model of the vertebrate brain. DP, dorsal pallium; fp, floor plate; 
VP, ventral pallium. See Figs 1-3 for other abbreviations. Scale bars, 
500 μm (a–d) and 100 μm (e–h). 


Next, we looked for the presence of a rhombic lip in L. japonicum 
larvae. Lamprey Atoh1, Wnt1 and Ptf1a genes, as well as the recently 
identified lamprey Pax6B gene, were expressed in a dorsoventrally 
organized pattern in the hindbrain (Fig. 4), indicating specification 
of a rhombic-lip-like property as in gnathostomes, although Pax6B 
expression was found more ventrally than in gnathostomes (Extended 
Data Fig. 7s). Therefore, as in the case of the MGE, lamprey rhombic 
lip specification is divergent from the gnathostome condition. Notably, 
because the gnathostome Atoh1 and Ptf1a genes are involved in the 
development of the cerebellum granule and Purkinje cells, respectively, 
certain genetic backgrounds underlying the acquisition of the 
cerebellum proper in gnathostomes, yet absent in cyclostomes, might 
have been established before the splitting of the two groups (Fig. 4). 
Altogether, the lamprey developmental pattern suggests the presence 
of a rhombic lip domain (Fig. 4 and Extended Data Fig. 7). 

The absence of certain domains in the lamprey brain once led 
researchers to examine a model in which the brain evolution followed 
a stepwise elaboration towards crown gnathostomes. However, we 
show here that two major domains of the vertebrate brain, previously 
regarded as gnathostome novelties, are indeed present in cyclostomes 
as distinct gene expression domains. These domains would have fol- 
lowed further subsequent elaboration in each lineage. For instance, 
only the gnathostome rhombic lip led to the cerebellum proper as a 
gnathostome novelty (Fig. 4j). That part would probably have preceded 
acquisition of a jaw, because jawless stem gnathostomes, such as 
ostracoderms, probably had a morphologically distinct cerebellum. Thus, 
our findings markedly alter the accepted evolutionary model (Fig. 4) 
and suggest that the vertebrate brain genoarchitecture, with regional 
gene expression patterns as seen in crown gnathostomes, is much 
older than previously thought, dating back to more than 500 million 
years ago when the latest common ancestor of crown vertebrates 
arose (Fig. 4). 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


No statistical methods were used to predetermine sample size. The experiments 
were not randomized, and investigators were not blinded to allocation during 
experiments and outcome assessment. 

Sample collection and aquarium maintenance of hagfish, lamprey and shark. 
Mature males and females of the Japanese inshore hagfish (E. burgeri), were 
obtained in the Japan Sea off Shimane prefecture as described. Eggs were 
deposited in a cage set on the sea bottom in October 2011. Deposited eggs were then 
collected and incubated in an aquarium until embryos reached the required stages. 
Embryos were fixed with 4% paraformaldehyde in late March 2012. We mainly 
used two embryos (at stages 45 and 53) in this study. Embryos used in Fig. 4d and 
Extended Data Fig. 7m, n were obtained previously. Sources of the Japanese 
lamprey (L. japonicum; recently included as a synonym for L. camtschaticum), 
were as described. Lamprey embryos were staged as described and fixed in 4% 
paraformaldehyde. Adult catsharks (S. torazame) were obtained in Ibaraki 
prefecture, Japan, and kept in aquarium tanks. Deposited eggs were incubated in the 
same tanks at 16 °C until embryos reached the required stages. Catshark embryos 
were staged and fixed with 4% paraformaldehyde as described. The sampling and 
experiment were conducted according to the institutional and national guidelines 
for animal ethics. 

Gene identification, cDNA cloning and sequencing. For the identification of 
hagfish Hedgehog, Atoh1 and catshark Atoh1, Ptf1a and Wnt1 genes, the 
nucleotide sequences were obtained from embryonic transcriptomes of E. burgeri and 
S. torazame, respectively, and assembled from an in-house RNA-sequencing 
(RNA-seq) data set (to be published elsewhere). Hagfish FoxG sequence was 
retrieved from the Vertebrate Time Capsule database. In all cases, we 
performed TBLASTN using as queries the amino acid sequences of corresponding 
orthologues from human, mouse, chicken and zebrafish (plus L. japonicum HhA, 
-B, -C and the amphioxus Branchiostoma floridae Hh gene in the case of hagfish 
Hedgehog). We found four putative hagfish Hh genes, including the previously 
reported Hh1 (ref. 18), and designated the three new genes as Hh2-4 arbitrarily. 
The lamprey Nkx2.1/2.4, Atoh1, Ptf1a and Wnt1 genes were identified by 
means of TBLASTN using human, mouse, chicken and zebrafish counterparts 
as queries against L. japonicum low-coverage draft genome sequence version 
LetJap1.0 (ref. 28; accession number APJL00000000, 
http://jlampreygenome.imcb.a-star.edu.sg/). We found three Nkx2.1 or Nkx2.4 
gene-containing scaffolds: 242 (GenBank accession number KE993913, containing the 
previously reported TTF-1; ref. 36), 17 (KE993688) and 73 (KE993744). We found 
lamprey Wnt1 in scaffold 103 (KE993774), two Ptf1a paralogues in scaffolds 196 
(Ptf1a-A; KE993867) and 473 (Ptf1a-B; KE994144), and Atoh1 in scaffold 280 
(KE993951). The corresponding gene sequences were predicted by means of 
GeneWise (http://www.ebi.ac.uk/Tools/psa/genewise/) or GENSCAN 
(http://genes.mit.edu/GENSCAN.html) and curated manually. Phylogenetic 
and synteny conservation analyses (see below) could not resolve the 
orthology relationships between Nkx2.1 and Nkx2.4, so we designated these genes as 
Nkx2.1/2.4A (TTF-1, scaffold 242), Nkx2.1/2.4B (scaffold 17) and Nkx2.1/2.4C 
(scaffold 73). As in gnathostomes, Nkx2.1/2.4A and C were linked to the 
Nkx2.2 and Nkx2.8 genes, which were designated as Nkx2.2/2.8A and C, 
respectively. A partial sequence of a third putative Nkx2.2/2.8B gene was found in 
contig 79345 (accession number APJL01110193). The shark FoxG1 
homologous gene was cloned using degenerate primers designed based on the 
consensus amino acid sequences of vertebrate FoxG1 counterparts (forward primer, 
5′-ATGGGNGANMGNAARGA-3′; reverse, 5′-NCCYTGRTTYTGRTGNGT-3′; 
sequence based on IUPAC ambiguity code). An E. burgeri OtxC gene fragment 
was obtained using specific primers based on the nucleotide sequence of the 
Atlantic hagfish (Myxine glutinosa) OtxC gene. 
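
As an aside on the degenerate primers quoted above, the following sketch shows how the IUPAC ambiguity code expands into concrete sequences. It is illustrative only and was not part of the cloning procedure; the helper names are hypothetical, and the two primer strings are copied from the text.

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

FORWARD = "ATGGGNGANMGNAARGA"
REVERSE = "NCCYTGRTTYTGRTGNGT"


def degeneracy(primer):
    """Number of distinct sequences that a degenerate primer represents."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def to_regex(primer):
    """Turn a degenerate primer into a regular expression over A, C, G and T."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return "".join(parts)


def matches(primer, sequence):
    """True if the sequence is one of the primer's concrete variants."""
    return re.fullmatch(to_regex(primer), sequence.upper()) is not None
```

Under this table each of the two primers stands for 256 concrete sequences, and `matches(FORWARD, "ATGGGCGACCGCAAGGA")` picks out one of them.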

Total RNAs of E. burgeri and S. torazame were extracted from whole embryos 
using TRIZOL reagent (Invitrogen Life Sciences). Reverse transcription PCR 
(RT-PCR) was performed to amplify fragments of each gene with specific 
primers. Lamprey gene fragments were obtained from a stage-24 cDNA library 
(ZAP-cDNA Synthesis Kit, 200400, Stratagene). PCR products were cloned 
into the pCRII-TOPO vector (Invitrogen) and subsequently sequenced with a 
3130 Sequence Analyzer (Applied Biosystems). The 5’ and 3’ ends were ampli- 
fied with GeneRacer kits (Invitrogen Life Sciences), or from the lamprey cDNA 
library with customized primers. The cDNA sequences identified here have been 
deposited in GenBank under accession numbers LC028239-43, LC028245 and 
KT897926-27 and KT897930-38. Shark Pax6 (AB773851), Dlx2 (AB293582), Shh 
(AB247650), Nkx2.1 (AB773852), hagfish Pax6 (AB270704), EmxA (AB935431), 
EmxB (AB935432), Nkx2.1 (AB747372), Hh1 (AB703680) and Fgf8/17 (AB703681) 
and lamprey Pax6 (AB061220), Pax6B (supplementary table 3 of ref. 29), DlxA 
(AB292628), HhA (AB124584), HhB (AB583549) and HhC (AB583548) were 
obtained previously. 
In situ hybridization. In situ hybridization on paraffin wax-embedded sections 
was performed as described. In brief, deparaffinized sections (8–10 μm) were 
treated with Protease K (2–10 μg ml⁻¹) at room temperature for 5–15 min. Slides 
were incubated in hybridization buffer (50% formamide, 5× SSC (pH 7.0), 1% SDS, 
50 μg ml⁻¹ total yeast RNA, 50 μg ml⁻¹ heparin sulphate, 5 mM EDTA (pH 8.0), 
0.1% CHAPS and 1% blocking reagent (Roche)) with 0.2–1.0 μg ml⁻¹ DIG-labelled 
RNA probe overnight at 65 °C. For immunohistochemical detection, sections were 
incubated with alkaline phosphatase (AP)-conjugated anti-DIG Fab fragments 
(diluted 1:4,000) for 2h after blocking. Colour development was performed using 
the BM purple AP substrate (Roche). Whole-mount in situ hybridization of lam- 
prey and shark embryos was performed according to refs 32 and 41, respectively. 
Immunohistochemical staining. Deparaffinized sections of hagfish embryos were 
blocked with 5% skim milk in TBST (TSTM). They were then reacted overnight 
at room temperature with an anti-acetylated tubulin antibody (Sigma-Aldrich, 
T6793) diluted in TSTM (1:1,000). After the samples had been washed with TBST, 
they were reacted for 2h with Alexa Fluor 488-conjugated goat anti-mouse anti- 
body (Invitrogen Life Sciences, A11001) diluted in TSTM (1:400). After the final 
wash in TBST, embryos were observed using an Axio Zoom V16 fluorescence 
microscope (Carl Zeiss). 

3D reconstruction of the hagfish brain. The 3D images of the ventricular system 
of hagfish embryos were reconstructed based on the data set used previously'®. 
Reconstructed images were acquired using Avizo software (Visualization Sciences 
Group). 

Molecular phylogenetic analysis. The molecular phylogenetic trees of Hh, 
Nkx2.1/2.4, Fgf8/17, FoxG, Wnt1, Atoh1 and Ptf1a genes of E. burgeri, L. japonicum 
and S. torazame were inferred with the maximum-likelihood method using 
PhyML3.0 (http://www.atgc-montpellier.fr/phyml/download.php) with the JTT+G4 
model. 

Synteny conservation analysis. According to ref. 44, comparisons of genomic 
regions including vertebrate Nkx2.1/2.4 were performed among the following 
species: human (Homo sapiens), chicken (Gallus gallus), coelacanth (Latimeria 
chalumnae), western clawed frog (Xenopus tropicalis), elephant shark 
(Callorhinchus milii) and Japanese lamprey (L. japonicum). In brief, gnathostome 
Nkx2.1/2.4 orthologues and the adjacent protein-coding genes were arranged 
using the UCSC (http://genome.ucsc.edu) and Ensembl (http://www.ensembl.org/) 
genome browsers. For lamprey, genome scaffolds 17 (GenBank accession 
KE993688), 73 (KE993744) and 242 (KE993913) containing the Nkx2.1/2.4 homo- 
logues (see above) were analysed using the gene prediction tool, GENESCAN*. 
Subsequently, deduced amino acid sequences of the predicted genes were used 
as queries for NCBI TBLASTN searches and annotated with an E-value cut-off 
of 1x 10~°, 
Extended Data Figure 1 | Embryonic brain of a shark. a-q, Gene 
expression patterns and nerve tracts in a stage-27 (ref. 34) embryo of 
the cloudy catshark (S. torazame), to show the conserved developmental 
pattern of the jawed vertebrate brain. Whole-mount views (a-e), sagittal 
sections of the brain (g-k) and transverse sections at the telencephalon 
level (m-q; the sectioned region is shown by the dotted line in g), 
stained using probes for FoxG1 (a, g, m), Pax6 (b, h, n), Dlx2 (c, i, o), 
Nkx2.1 (d, j, p) and Shh (e, k, q). f, l, Confocal images of the central and 
peripheral nervous systems visualized by immunofluorescence using an 
anti-acetylated tubulin antibody (green) and DAPI (blue). r, Schematic 
drawing of the shark telencephalon, showing conserved gene expression 
patterns among jawed vertebrates. Some of these patterns were described 
previously. op, olfactory placode; rp, Rathke's pouch. See Figs 1 and 2 for 
other abbreviations. Scale bars, 500 µm. 
Extended Data Figure 2 | Adult and embryonic brains of E. burgeri, 
and gene expression and nervous system staining of stage-45 embryo. 
a, b, Dorsal view (a) and sagittal section (b) of the adult brain of E. burgeri. 
c-h, E. burgeri embryos at Bashford Dean stages 45 (c-e) and 53 (f-h) (refs 17, 18). 
Dorsal (c, f) and lateral (d, g) views are shown. e, h, Sagittal sections 
stained with haematoxylin and eosin. Dean stage 53 is very similar to 
the embryo described in figure 1 in ref. 45. i-t, A stage 45 E. burgeri 
embryo observed using histological preparations. Dotted lines indicate 
the presumptive telencephalic border estimated by EmxB and FoxG 
expressions. At this stage, Nkx2.1/2.4 and Hh2 have not been expressed 
yet in the rostral telencephalon (according to the prosomeric model). 
cer, cerebrum; met, metencephalon; mes, mesencephalon; ob, olfactory 
bulb. See Figs 1, 2 and 4 for other abbreviations. Scale bars, 2.0 mm 
(a-c and f), 1.0 mm (e, h) and 500 µm (i, j). 
Extended Data Figure 3 | Identification of the forebrain-midbrain-hindbrain 
boundaries in early hagfish embryos. a-c, The mid-hindbrain 
boundary (MHB) of the hagfish brain is assumed to correspond to the 
posterior expression boundary of OtxC (b) and focal expression of Fgf8/17 
(see Extended Data Fig. 5) (c) that become evident at stages 28-30. 
a-d, On the other hand, the fore-midbrain boundary (FMB) is suggested 
morphologically (a, b) and by the ventrocaudal portion of Nkx2.1/2.4 
expression domain (d, arrowheads). See Figs 1 and 2 for abbreviations. 
Scale bar, 200 µm. 
Extended Data Figure 4 | Absence of the epiphysis and presence of the 
ZLI in the developing brain of the hagfish. a, b, The adult hagfish has no 
epiphysis. A rudimentary epiphysis described previously (refs 14, 46) was likely to 
have been an artefact derived from an inappropriate method of fixation. 
Although von Kupffer (1900) described an epiphysis-like structure in 
his illustration (a, b, redrawn from ref. 15), he denied its identity as an 
epiphysis because it is distantly positioned from the posterior commissure 
(pc). c, The position of the epiphysis (ep) in a shark embryo is shown. 
d-g, In our present observation of hagfish embryos also, several 
neuroepithelial cysts were found in the forebrain (arrows in d-f), which 
were always associated with the development of blood vessel formation in 
the brain (arrowhead in g). Arrowheads in e indicate the crack between 
the neuroepithelial cyst and the brain tissue caused during the dehydration 
process, reminiscent of von Kupffer's epiphysis-like structure in b. 
h-k, Hh2 expression in stage-45 (h, i) and stage-53 (j, k) hagfish embryos. 
Medial sections (h, j) and lateral sections (i, k). Dotted lines in i and k 
indicate the presumptive ZLI region. See Figs 1-3 for abbreviations. 
Scale bars, 500 µm. 
Extended Data Figure 5 | Molecular phylogenetic trees of genes 
identified in this study. The trees were inferred with the maximum-likelihood 
method using PhyML3.0 with the JTT+G4 model. Members 
of individual gene families were collected from public databases, and 
groupings of the sequences identified in this study into the subfamilies 
shown in this figure were confirmed by preliminary phylogenetic 
inferences including other subfamilies. a, Hedgehog homologues 
(shape parameter for the gamma distribution alpha = 0.80). A total of 
249 amino acid sites unambiguously aligned without any gap were used 
in the inference. b, Nkx2.1/2.4 homologues (86 sites; alpha = 0.60). 
c, Wnt1 homologues (196 sites; alpha = 0.60). d, Ptf1a homologues 
(67 sites; alpha = 0.92). e, Atoh1 homologues (60 sites; alpha = 0.74). 
f, Fgf8/17/18/24 homologues including the hagfish homologue identified 
previously (74 amino acid sites; alpha = 0.87). g, FoxG1 homologues 
(115 amino acid sites; alpha = 0.16). Hagfish, lamprey and catshark 
sequences are shown in red, green and blue, respectively. Support values 
at nodes are bootstrap probabilities in the maximum-likelihood method 
and those in the neighbour-joining method (under the above-mentioned 
substitution model), in order. 
Hh1 Hh2 Hh3 Hh4 
st.45 st.53 st.45  st.53 st.45  st.53  st.45 — st.53 


Hypothalamus + + + + nd = nd nd 


Floor plate + + + + + + = nd 


Pharyngeal endoderm + + + + + - a nd 


+, present; —, absent; nd, no data 


Extended Data Figure 6 | Expression patterns of hagfish Hedgehog 
genes. a-i, In situ hybridization staining of four Hedgehog genes (Hh1-4) 
of E. burgeri at stage-45 (a-f) and stage-53 (g-i) embryos. The Hedgehog 
expressions were found in notochord (n), floor plate (fp), pharyngeal 
endoderm (pe), oral ectoderm (oe) and subregions of the forebrain 
(hypothalamus, ZLI and MGE; see Fig. 3). The overall expressions of 
hagfish Hh1-4 exhibited patterns similar to those of Hedgehog genes in 
jawed vertebrates, although each Hh expression showed slight differences 
in some domains. j, Summary of the expression patterns of Hedgehog genes 
(Hh1-4) of E. burgeri in stage 45 and stage 53 embryos. Scale bars, 500 µm. 


© 2016 Macmillan Publishers Limited. All rights reserved 



Extended Data Figure 7 | Rhombic lip gene expression patterns in 
shark, hagfish and lamprey embryos. Gene expression patterns in 
stage-27 (whole-mount) and stage-31 (transverse sections) embryos of 
a catshark (S. torazame) (a-l), stage-53 embryo of a hagfish (E. burgeri) 
(m, n), and stage-26 embryo of lamprey (L. japonicum) (o-t), stained 
using probes for the Pax6, Wnt1, Atoh1 and Ptf1a gene homologues. 
u, Schematic transverse section of the vertebrate rhombic lip showing 
crucial gene expression patterns based on ref. 27 and the present study. 
Lines in a indicate the levels of the transverse sections shown in 
e-h (rhombomere 1: the cerebellar primordia) and i-l (posterior 
rhombomeres). The line in o indicates the level of the transverse sections 
shown in o′-s′ (around rhombomere 4). Arrowheads indicate rhombic 
lip gene expressions. Asterisks indicate non-rhombic lip expressions of 
Pax6 through the neural tube. See Figs 1 and 2 for abbreviations. 
Scale bars, 500 µm (a, e, i, m, o) and 100 µm (o′). 
Extended Data Figure 8 | Synteny conservation between the genomic 
regions containing Nkx2.1/2.4 among gnathostomes and L. japonicum. 
Numbered boxes represent single protein-coding genes; all the genes 
used for synteny analysis and the given numbers are listed in Extended 
Data Table 1. Colours of boxes are as follows: red, Nkx2-1/2-4 orthologue; 
orange, Nkx2-2/2-8 orthologue; grey, a gene showing high sequence 
similarity to two gnathostome ohnologues as a result of the TBLASTN 
search. The green line represents orthology between the neighbouring 
vertebrate genomes, whereas the purple line indicates a paralogous 
relationship of each gene among the lamprey genome scaffolds. At present, 
there are no descriptions of chicken NKX2-4 (nor in birds) and we have 
not been able to find it. This putative lack is marked by a question mark. 
Extended Data Figure 9 | Telencephalic gene expression patterns in 
lamprey embryos. a-p, Gene expression patterns in L. japonicum stage-26 
and -27 (ref. 33) embryos, stained using probes for Nkx2.1/2.4A (a, b), 
Nkx2.1/2.4B (c, d), Nkx2.1/2.4C (e, f), Pax6 (g), DlxA (h), HhA (i, j), 
HhB (k, l), HhC (m, n) and HhD (o, p). Dotted lines indicate the 
telencephalic border distinguished by the anterior intraencephalic sulcus. 
The expression domains of Pax6 (g) and DlxA (h) in the telencephalon 
pallial and subpallial subdivisions, respectively. Although the previously 
reported Nkx2.1/2.4A is not expressed in the telencephalon, Nkx2.1/2.4B 
and Nkx2.1/2.4C are expressed in the rostral telencephalon (arrowheads), 
suggesting the presence of the MGE in the lamprey. We did not detect 
expression of any Hedgehog genes in the rostral telencephalon (i-p). 
q, Summary of the expression patterns of Nkx2.1/2.4 and Hedgehog genes 
of L. japonicum at embryonic stages 17-30 examined by whole-mount 
in situ hybridization. en, endostyle; ptu, posterior tuberculum. 
See Figs 1 and 2 for other abbreviations. Scale bar, 200 µm. 
Extended Data Table 1 | List of genes used in the synteny conservation analysis 

Nkx2.1 & Nkx2.8 group 

Number Gene name 
1 SNX6 
2 CFL2 
3 BAZ1A 
4 ENSLACG00000017921 
5 SINCAMG00000016188 
6 ENSGALG00000010034 
7 SRP54 
8 FAM177A1 
9 PPP2R3C 
10 KIAA0391 
11 PSMA6 
12 NFKBIA 
13 INSM2 
14 RALGAPA1 
15 BRMS1L 
16 MBIP 
17 SFTA3 
18 NKX2.1 
19 NKX2.8 
20 ENSGALG00000010113 
21 PAX9 
22 SLC25A21 
23 ENSLACG00000014546 
24 ENSLACG00000014339 
25 MIPOL1 
26 FOXA1 
27 ENSGALG00000010121 
28 TTC6 
29 SSTR1 
30 CLEC14A 
31 CAPN3 
32 SEC23A 
33 HRH2 
34 SINCAMG00000016299 
35 GEMIN2 
36 TRAPPC6B 
37 PNN 
38 MIA2 
39 CTAGE5 
40 FBXO33 
41 CHST9 
42 ITPKA 
43 ZNF106 
44 EAPP 

Nkx2.4 & Nkx2.2 group 

Number Gene name 
45 SLC24A3 
46 RIN2 
47 NAA20 
48 CRNKL1 
49 CFAP61 
50 GTF2IRD2 
51 INSM1 
52 RALGAPA2 
53 ATP2A1 
54 ENSXETG00000032786 
55 TCTN3 
56 PPP1CA 
57 SINCAMG00000008965 
58 ENSXETG00000009758 
59 ENSLACG00000009798 
60 ENSLACG00000001348 
61 KIZ 
62 XRN2 
63 NKX2.4 
64 NKX2.2 
65 PAX1 
66 FOXA2 
67 SSTR4 
68 THBD 
69 SINCAMG00000008008 
70 SFT2D1 
71 PRR18 
72 CFAP36 
73 SMEK2 
74 PNPT1 
75 EFEMP1 
76 CCDC85A 
77 VRK2 
78 FANCL 
79 ENSXETG00000020482 
80 ENSLACG00000017923 
81 ENSLACG00000017898 
82 CD93 
83 NXT1 
84 GZF1 
85 ENSGALG00000008350 
86 NAPB 
87 NUDT18 
88 PRDX5 

Each gene number corresponds to the numbers in Extended Data Fig. 8. 
Late acquisition of mitochondria by a host with 
chimaeric prokaryotic ancestry 


Alexandros A. Pittis1,2 & Toni Gabaldón1,2,3 


The origin of eukaryotes stands as a major conundrum in biology1. 
Current evidence indicates that the last eukaryotic common ancestor 
already possessed many eukaryotic hallmarks, including a complex 
subcellular organization1-3. In addition, the lack of evolutionary 
intermediates challenges the elucidation of the relative order of 
emergence of eukaryotic traits. Mitochondria are ubiquitous 
organelles derived from an alphaproteobacterial endosymbiont4. 
Different hypotheses disagree on whether mitochondria were 
acquired early or late during eukaryogenesis5. Similarly, the nature 
and complexity of the receiving host are debated, with models 
ranging from a simple prokaryotic host to an already complex 
proto-eukaryote1,3,6,7. Most competing scenarios can be roughly 
grouped into either mito-early, which consider the driving force of 
eukaryogenesis to be mitochondrial endosymbiosis into a simple 
host, or mito-late, which postulate that a significant complexity 
predated mitochondrial endosymbiosis8. Here we provide evidence 
for late mitochondrial endosymbiosis. We use phylogenomics to 
directly test whether proto-mitochondrial proteins were acquired 
earlier or later than other proteins of the last eukaryotic common 
ancestor. We find that last eukaryotic common ancestor protein 
families of alphaproteobacterial ancestry and of mitochondrial 
localization show the shortest phylogenetic distances to their 
closest prokaryotic relatives, compared with proteins of different 
prokaryotic origin or cellular localization. Altogether, our results 
shed new light on a long-standing question and provide compelling 
support for the late acquisition of mitochondria into a host that 
already had a proteome of chimaeric phylogenetic origin. We 
argue that mitochondrial endosymbiosis was one of the ultimate 
steps in eukaryogenesis and that it provided the definitive selective 
advantage to mitochondria-bearing eukaryotes over less complex 
forms. 

Previous analyses infer a last eukaryotic common ancestor 
(LECA) proteome of diverse phylogenetic origin'®. Notably, only 
a fraction of the proteins of bacterial descent can be traced back to 
Alphaproteobacteria, the group from which mitochondria originated*. 
Attempts to explain alternative bacterial signals in LECA range from 
invoking horizontal gene transfer (HGT), phylogenetic noise or addi- 
tional symbiotic partners””®, including the possibility that part of this 
diversity could have already been present in the putative archaeal host'!. 
Resolving whether LECA proteins of bacterial descent were acquired 
in bulk is key to testing competing eukaryogenesis models. Here, we 
set out to assess whether the LECA proteins with alphaproteobacterial 
ancestry show distinct patterns in terms of their current cellular locali- 
zations, and evolutionary distances to their closest ancestors, compared 
with LECA proteins of other descent. For this, we surveyed the phyloge- 
netic signal of inferred LECA proteomes (see Methods). First, the likely 
phylogenetic origin of each LECA family was assessed by evaluating the 
taxonomic distribution of prokaryotic sequences present in its closest 
neighbouring tree partition (see Methods and Fig. 1a). We then estab- 
lished a measure of phylogenetic distance for the branch subtending the 


LECA family and connecting it to the last prokaryotic ancestor shared 
with its closest prokaryotic relatives (raw stem length; Fig. 1a). Branch 
lengths indicate the number of inferred substitutions per site, which 
reflect both divergence time and evolutionary rate. To disentangle time 
from rates, which may vary across families, we normalized the raw 
stem length by taking into account the median of the branch lengths 
within the LECA family (see Methods for further details). We used this 
measurement (hereafter referred to as stem length) as a proxy for the 
phylogenetic distance between a given LECA protein family and its last 
shared ancestor with prokaryotes. Competing mito-early and mito-late 
hypotheses naturally differ in their expectations of stem lengths for 
proteins of proto-mitochondrial origin compared with those of other 
putative origins. In a simple fusion model, with the proto-mitochon- 
drion contributing most of the bacterial component, one would expect 
Figure 1 | Stem length analysis. a, Schematic representation of the 
inference of the phylogenetic origin of LECA groups and the measured 
phylogenetic distances. First monophyletic groups of eukaryotic proteins 
that passed the required thresholds were considered as protein families 
present in LECA (purple box). The taxonomic range of the proteins 
present in the closest neighbouring tree partition (sister group, blue 

box) was used to define the putative phylogenetic origin of the LECA 
family. Distance to the common ancestor with the closest prokaryotic 
neighbouring group was measured (raw stem length, rsl) and normalized 
(stem length, sl) by dividing it by the median of the distances from the 
eukaryotic terminal nodes to the last common ancestor of all eukaryotic 
sequences (eukaryotic branch length, ebl). b, Subpopulation distributions 
within the overall stem length distribution (inset) as defined by a 
mixture model and the expectation-maximization algorithm. The four 
subpopulations/components are over-represented in different prokaryotic 
phylogenetic groups of origin, Gene Ontology (GO) and clusters of 
orthologous groups (COGs) functional category annotations (see text, 
Table 1 and Supplementary Tables 1 and 2). On top of these components, 
we represent the cellular localizations for which each family class is 
enriched. FECA, first eukaryotic common ancestor. 
Table 1 | Over-represented phylogenetic origins, GO terms and functional categories in the different components 


Component Size Phylogenetic origin Cellular localization Cellular function 
Group N P GO cellular component WN P Functional category N P 
1 452 Bacteria 388 <1x10- Mitochondrion 150 <1x10-® Amino-acid transport and 72 1.8x10-4 
metabolism 
Alphaproteobacteria 49 11x10-4 Energy production and 45 1610-2 
conversion 
Chlamydiae/Verrucomicrobia 19 1.4x 10-2 Coenzyme transport and 29. 4910-7 
group metabolism 
Deltaproteobacteria 29° 2.0x 10-7 
2 284 - Endoplasmic reticulum 32 3.5x10-% Carbohydrate transportand 28 3.7x10-2 
metabolism 
Golgi apparatus 1 4110-2 
Extracellular space 14x 10-4 
3 234 Archaea 80 <1x10-© Nucleoplasm 3 2.7x10°3 Replication, recombination 24 61x10-+ 
and repair 
Euryarchaeota 30 1.3x10-* Nucleus 80 5.9x10-° Translation, ribosomal 46 66x10? 
structure and biogenesis 
Crenarchaeota 15 3.4x10-3 Chromosome 4 74x10-3 Transcription 10 4910-2 
Korarchaeota 7 1.2x10-? Nuclear 9 24x10? 
chromosome 
Actinobacteria 16 2.7x10-* Nucleolus 9 2.5x10-4 
Protein complex 46 2.9x10-2 
4 94 Archaea 41 <1x10-® Ribosome 24 <1x10-° Translation, ribosomal 36 <1x10-6 
structure and biogenesis 
Thaumarchaeota 8 49x10“ Cytosol 39 <1x10°° 
Euryarchaeota 16 14x10? Organelle 70 1.7x10-? 
Crenarchaeota 7 2.7x10-? Nucleolus 10 14x 10-4 


N, number of LECA families per term, in each component. P values <10⁻⁶ reflect value 0 in 10⁶ permutations. 


stem lengths of bacterial-derived proteins to be similar. In contrast, 
significant differences would be predicted by models involving different 
waves of gene acquisition. We assessed differences in stem length, pro- 
tein function and subcellular localization across 1,078 LECA families 
of different origins. 
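
The normalization described above (raw stem length divided by the median of the eukaryotic branch lengths within the family) can be sketched as follows. This is a minimal illustration only: the function name, argument names and the example values are assumptions made here, not the authors' pipeline or data.

```python
from statistics import median


def normalized_stem_length(raw_stem_length, eukaryotic_branch_lengths):
    """Return sl = rsl / median(ebl) for one LECA family.

    raw_stem_length: branch length (substitutions per site) of the stem
        subtending the LECA family (rsl).
    eukaryotic_branch_lengths: distances from the eukaryotic terminal nodes
        to the last common ancestor of the family (ebl values).
    """
    if not eukaryotic_branch_lengths:
        raise ValueError("need at least one eukaryotic branch length")
    ebl = median(eukaryotic_branch_lengths)
    if ebl <= 0:
        raise ValueError("median eukaryotic branch length must be positive")
    return raw_stem_length / ebl


# Hypothetical family; the numbers are illustrative, not taken from the paper.
example_family = {"rsl": 0.6, "ebl": [0.8, 1.2, 1.0, 1.5, 0.9]}
example_sl = normalized_stem_length(example_family["rsl"], example_family["ebl"])
print(example_sl)
```

Dividing by the median rather than the mean keeps a few fast-evolving eukaryotic lineages from dominating the normalization, which is one plausible reason for that choice.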

We first used an unsupervised approach to assess whether the dis- 
tribution of stem lengths in LECA families was homogeneous. By 
using the expectation-maximization algorithm”? to fit observed data 
to a mixture model, we inferred four distinct underlying distributions 
(Fig. 1b), each containing a subset of LECA families. We asked whether 
each underlying distribution contained an enrichment of protein fam- 
ilies with (1) a particular taxonomic origin, (2) a particular subcellular 
localization or (3) a particular functional category. Notably, we found 
that the first component (shortest stems) was enriched in families with 
bacterial origins (most particularly alphaproteobacterial), mitochon- 
drial localization and energy production (see Table 1). In contrast, 
the two components with the longest stems (third and fourth) were 
enriched in families of archaeal and actinobacterial origins, and in 
annotations related to the nucleus and ribosomes (Fig. 1b and Table 1). 
The second component showed no enrichment in any ancestry, but 
a significant enrichment in endomembrane system localization. The 
above results are only consistent with mito-late models, with the 
archaeal contributions to eukaryotes, mainly associated with nuclear 
structures and genes related to informational processes (replication, 
transcription, translation), being more ancient; with the prokaryotic 
proteome of the endomembrane system being integrated later; and 
with the alphaproteobacterial contribution, associated with mito- 
chondria and energy production, appearing later than other bacterial 
components. 
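
As a rough illustration of the unsupervised step, the sketch below fits a one-dimensional mixture with the expectation-maximization algorithm. It assumes Gaussian components and a user-supplied number of components; the text here does not state which distribution family or model-selection procedure the authors used, so both are assumptions, as are all names and the toy data.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_gaussian_mixture(values, k, iterations=200, min_sd=1e-3):
    """Fit a k-component 1D Gaussian mixture by plain EM.

    Returns (weights, means, sds). Initialization is deterministic:
    means start at evenly spaced quantiles of the data.
    """
    data = sorted(values)
    n = len(data)
    means = [data[int((i + 0.5) * n / k)] for i in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    sds = [max(math.sqrt(overall_var), min_sd)] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, s)
                    for w, m, s in zip(weights, means, sds)]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and spread of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            sds[j] = max(math.sqrt(var), min_sd)
    return weights, means, sds


# Toy log-scale stem lengths with two obvious groups (illustrative only).
toy = [-1.1, -1.0, -0.9, -1.05, -0.95, 0.9, 1.0, 1.1, 0.95, 1.05]
toy_weights, toy_means, toy_sds = fit_gaussian_mixture(toy, k=2)
print(sorted(toy_means))
```

Each family can then be assigned to the component with the highest responsibility, after which enrichment of origins, localizations or functions is tested per component.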

We tested this hypothesis more directly by grouping the LECA 
families by their inferred phylogenetic origin, and by their func- 
tional and subcellular localization annotations, and then testing 
whether their respective stem lengths were significantly different 
(Fig. 2a and Extended Data Fig. 1a-c). Overall, LECA families of 
bacterial origins have significantly shorter stems than families of 
archaeal origin (P= 1.38 x 10~*°, two-sided Mann-Whitney U-test). 
Importantly, eukaryotic families of alphaproteobacterial descent 
showed the shortest stems, together with families pointing to the 
Verrucomicrobia/Chlamydiales group. These lengths were signifi- 
cantly smaller than those found in LECA families of different bacterial 
origins (P= 4.4 x 10-7). When grouping LECA families according to 
their functional annotations, we found that those involved in infor- 
mational processes had the longest stems, followed by those involved 
in cellular and signalling processes, with families involved in met- 
abolic processes showing the shortest stems (Fig. 2c and Extended 
Data Fig. 1c). Next, we asked whether LECA families predominantly 
present in distinct subcellular compartments showed differences in 
terms of phylogenetic origins and stem lengths. Consistent with the 
above results, nuclear protein families had the longest stems, followed 
by those involved in the endomembrane system, and finally mito- 
chondrial proteins tended to have the shortest stems (Fig. 2d and 
Extended Data Fig. 1d). The fact that both function and evolution- 
ary origin correlate with stem length raises the need to disentangle 
the contribution of each of these factors. Our normalization assumes 
proportional (not necessarily constant) evolutionary rates in branches 
preceding and post-dating LECA, which both correspond to periods 
where the given protein had been incorporated into the host. Large 
shifts in evolutionary rates between the stem and post-LECA phases 
may have differentially impacted families depending on their function, 
leading to the observed differences mentioned above. However, our 
results are independent of the normalization, as shown in comparisons 
using the raw stem lengths (Fig. 2e, f). Furthermore, in matched com- 
parisons, families of similar function, selection pressure, number of 
protein-protein interactions or expression levels but different origins 
show differences in stem lengths (Supplementary Information sec- 
tion 1 and Extended Data Fig. 2). Thus, phylogenetic origin, and not 
function, is the main driver of observed differences in stem lengths. 
To independently validate our approach, we assessed the relative tim- 
ing of the acquisition of plastids, a type of organelle whose origin 
from cyanobacteria subsequent to mitochondrial endosymbiosis is 
uncontroverted. Consistently, cyanobacterial-derived families had 
significantly shorter stem lengths than alphaproteobacterial-derived 
families, thereby further supporting our approach (Supplementary 
Information section 2 and Extended Data Fig. 3). 
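
The group comparisons above rely on the two-sided Mann-Whitney U-test. The following is a minimal sketch of that test using the normal approximation, written only to show what is being compared; it omits the tie and continuity corrections that statistical libraries usually apply, and the function names and sample values are assumptions for illustration.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0  # mean of positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U statistic of x, two-sided P) via the normal approximation.

    No tie or continuity correction is applied, so P is only approximate,
    especially for small samples.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u_x - mean_u) / sd_u
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return u_x, p_two_sided


# Hypothetical stem lengths for two classes of families (illustrative only).
bacterial_like = [0.2, 0.3, 0.25, 0.4, 0.35, 0.28]
archaeal_like = [0.9, 1.1, 0.8, 1.3, 1.0, 0.95]
print(mann_whitney_u(bacterial_like, archaeal_like))
```

Because the test is rank-based, it makes no assumption about the shape of the stem length distributions, which suits data that span several mixture components.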
Figure 2 | Phylogenetic distance profiles. a-d, Profiles of different 
prokaryotic sources (a, b), cellular functions (c) and cellular components 
(d). The lower and upper box limits in a, c and d correspond to the first 
and third quartiles (25th and 75th percentiles). a, Box plot comparing 
stem length distributions in LECA families with archaeal, non-alpha 
bacterial and alphaproteobacterial sister-groups. Numbers on the 

x axis indicate the number of families included in each class. Symbols 
indicate the P values obtained from a two-sided Mann-Whitney U-test 
for the indicated comparisons as follows: *P<5 x 1077; **P<1x 107%; 
*EKD <1 x 1073; HRP < 1 x 107°, b, The observed mean (jiops) stem 
length of alphaproteobacterial values compared with the random sampling 
distribution of means, under the null hypothesis that families of different 
bacterial origins do not show differences in stem lengths. The P value is 
the probability that the mean would be at least as extreme as the observed, 
if the null hypothesis were true. The dashed line and the shaded area under 
the density plot correspond to the one-sided P value of the test (indicated 
next to the figure). c, d, Box plots of stem length distributions in LECA 
families of different COG functional categories (c) and GO localizations 
(d), when considering all LECA families (All), or only those of bacterial 
descent (Bacterial). Other symbols as in a. e, f, The results obtained in 

a and b are consistent when using raw stem lengths, indicating that the 
relative differences in stem lengths are not driven by differences in the 
rates of evolution within extant eukaryotes (ebl). 
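
The resampling test described for panel b can be sketched as below: the observed mean of one group is compared with the means of random subsets of the same size drawn from the pooled families. The pool definition, the number of resamples, the seed and all names are assumptions for illustration, not the authors' settings.

```python
import random


def resampled_mean_p_value(group, pool, n_resamples=10000, seed=0):
    """One-sided P that a random subset has a mean at least as small.

    group: stem lengths of the class of interest (for example, families
        with an alphaproteobacterial sister group).
    pool: stem lengths of all families sharing the null hypothesis
        (for example, all families of bacterial descent).
    Returns (observed_mean, p_value).
    """
    rng = random.Random(seed)
    size = len(group)
    observed = sum(group) / size
    hits = 0
    for _ in range(n_resamples):
        sample = rng.sample(pool, size)  # draw without replacement
        if sum(sample) / size <= observed:
            hits += 1
    return observed, hits / n_resamples


# Toy example: the group holds the three smallest values of the pool.
toy_pool = list(range(1, 21))
toy_group = [1, 2, 3]
print(resampled_mean_p_value(toy_group, toy_pool))
```

The test is one-sided because the hypothesis of interest is directional: the group is expected to have shorter stems than a random set of families of bacterial descent.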


We next tested the robustness of our results with different 
LECA data sets, sequence sampling and phylogenetic methods 
(see Supplementary Information sections 3-5 and Extended Data 
Figs 4-6). Additional controls (Supplementary Information sections 
4 and 5 and Extended Data Fig. 6) showed that HGT alone 
cannot explain the observed signal from non-alphaproteobacterial 
bacteria, and discarded the possibility that shorter stem lengths in 
alphaproteobacterial-derived families resulted only from specific 
functional classes, or from those affiliated to Rickettsiales, whose 
specific clustering to mitochondrial proteins has been considered 
artefactual!?. Finally, we included data from the recently identified 
lokiarchaeon clade in our analysis'!. Even though we found that LECA 
families of inferred lokiarchaeal origin had stems larger than those of 
bacterial-derived families, they did show the shortest stems among 
archaeal-derived proteins, thereby providing additional support that 
there is a close association of this clade to eukaryotes (Supplementary 
Information section 6 and Extended Data Fig. 7). 

To gain further insight into the functionality and localization of the 
LECA families of different phylogenetic origins, we used correspond- 
ence analysis to visualize associations among these variables, and per- 
mutation tests to assess the statistical significance (see Methods, Fig. 3 
and Extended Data Fig. 8). We found that alphaproteobacterial-derived 
genes tend to associate with mitochondria (P < 10⁻⁶, permutation 
test with 10⁶ randomizations), whereas archaeal-derived families 
do so with the nucleus. Perhaps more unexpectedly, we found that 
LECA families of bacterial descent, except for Alphaproteobacteria, 
showed a clearly distinct pattern, being predominantly associated with 
endomembrane related compartments (Fig. 3b and Extended Data 
Fig. 8b, d). Consistent results were obtained when correlations between 
evolutionary origins and functional categories were evaluated (Fig. 3a 
and Extended Data Fig. 8a, c). In particular, the alphaproteobacte- 
rial component showed a unique correlation with energy production 
(P < 10⁻⁶). This result is not consistent with scenarios in which most 
of the bacterial components in LECA are assumed to originate from 
the alphaproteobacterial endosymbiont, because in this case a higher 
functional coherence would be expected among them. These results 
also reinforce the idea that, despite substantial subcellular re-targeting 
and functional diversification, the proto-mitochondrial-derived frac- 
tion of the eukaryotic proteome retains a tendency to be mitochondrial 
localized. Interestingly, alphaproteobacterial-derived families of 
mitochondrial localization have shorter stem lengths than mitochondrial 
families of different origins (P=6 x 107%), which indicates re-targeting 
to the newly formed organelle. 
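
The permutation tests used for these associations can be illustrated with the sketch below, which counts families carrying both a given origin and a given localization and compares that count with counts obtained after shuffling the localization labels. It also shows why a result with no permuted count reaching the observed one is reported as P below one over the number of randomizations. Labels, counts and names are hypothetical.

```python
import random


def permutation_association_test(origins, locations, origin, location,
                                 n_permutations=10000, seed=0):
    """Permutation test for co-occurrence of two labels across families.

    origins, locations: parallel lists with one label per family.
    Returns (observed_count, hits, n_permutations), where hits is the
    number of shuffles whose co-occurrence count is >= the observed one.
    """
    rng = random.Random(seed)

    def co_occurrence(locs):
        return sum(1 for o, l in zip(origins, locs)
                   if o == origin and l == location)

    observed = co_occurrence(locations)
    shuffled = list(locations)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between origin and location
        if co_occurrence(shuffled) >= observed:
            hits += 1
    return observed, hits, n_permutations


def format_p(hits, n_permutations):
    """Report an upper bound when no permutation reached the observed count."""
    if hits == 0:
        return f"P < {1.0 / n_permutations:.0e}"
    return f"P = {hits / n_permutations:.3g}"


# Hypothetical data: 10 'alpha' families, all of them mitochondrial.
toy_origins = ["alpha"] * 10 + ["other"] * 90
toy_locations = ["mito"] * 10 + ["else"] * 90
obs, hits, n = permutation_association_test(
    toy_origins, toy_locations, "alpha", "mito", n_permutations=2000)
print(obs, format_p(hits, n))
```

With 10⁶ randomizations and no shuffle matching the observed count, the same reporting rule yields a bound of P < 10⁻⁶.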

Altogether, our results provide compelling support for a late 
acquisition of mitochondria, as proposed by several eukaryogenesis 
models’. Specifically, our data suggest that most of the bacterial com- 
ponent of LECA, with origins other than alphaproteobacteria, was 
acquired earlier and mostly contributed to compartments other than 
the mitochondrion or the nucleus, and to processes besides energy 
production. We have shown that this pattern cannot be entirely 
explained by massive HGT to the proto-mitochondrial ancestor. 
This implies that these proteins were acquired by the host genome 
before mitochondrial acquisition. Thus, the host that engulfed the 
mitochondrion was already a complex cell, whose genome already 
harboured pathways and processes of diverse bacterial origins. Given 
the heterogeneity of these alternative bacterial origins, no simple 
model can explain this component. Serial symbiotic associations with 
different partners, the existence of prokaryotic consortia or gradual 
waves of HGT to the host genome before mitochondrial endosymbi- 
osis could all explain such chimaerism. Finally, the archaeal-derived 
component has the longest stems and the strongest association with 
the nucleus, consistent with the idea that eukaryotes have rooted 
from within archaea, and that the nucleus is of archaeal origin. Our 
results are compatible with either a complex proto-eukaryotic host 
or a complex archaeal host already harbouring many pathways of 
bacterial origin’. In either case, mitochondrial engulfment marked 
an end to massive bacterial HGT in LECA and the start of the diver- 
sification of extant eukaryotic lineages. We argue that mitochondrial 
endosymbiosis was indeed a crucial late step in eukaryogenesis, which 
brought about the definitive selective advantage that facilitated the 
dominance and radiation of the eukaryotic groups that have survived 
to the present day. 

Online Content Methods, along with any additional Extended Data display items and
Source Data, are available in the online version of the paper; references unique to
these sections appear only in the online paper.
Figure 3 | Correspondence of different LECA components with
different cellular localizations and functions. a, b, Correspondence 
analysis symmetrical biplots showing differences between the localizations 
(a) and functions (b) of the families of various phylogenetic origins. 

In both cases, the first principal components, accounting for the largest 
percentage of variance explained, clearly separate the bacterial and 
archaeal (brown ellipse) eukaryotic origins, while the second components 
separate the alphaproteobacterial (red dot) from the other bacterial origins 
(cyan ellipse). The numbers next to the principal axes (PC1, PC2) show 
METHODS


No statistical methods were used to predetermine sample size. The investigators 
were not blinded to allocation during experiments and outcome assessment. 
Sequence data. The sequences of proteins encoded by 3,686 fully-sequenced 
genomes of eukaryotes (238), Bacteria (3,318) and Archaea (130), as well as the 
192,421 non-supervised orthologous groups (NOGs) and COGs corresponding 
to the broadest taxonomic level (last universal common ancestor, LUCA), were 
downloaded from eggNOG version 4.0 (ref. 16); hereafter NOGs/COGs will be 
referred to as orthologous groups, indistinctively. In total, 11,504 orthologous 
groups containing 4,323,066 sequences both from eukaryotic and from prokaryotic 
species were considered. For the analysis including the recently sequenced member 
of Lokiarchaeota!', the 5,384 protein coding sequences of the archaeon Loki were 
downloaded as of 7 May 2015 from the Protein database of NCBI (http://www.ncbi. 
nlm.nih.gov/protein/) under the taxonomy identifier 1538547. 
Taxonomy-based sequence sub-sampling. To reduce data redundancy and 
obtain a more balanced representation of different eukaryotic families, the initial 
data set was sub-sampled using taxonomic criteria. We selected 37 eukaryotic 
species, covering all main eukaryotic subdivisions present in EggNOG version 
4 (unikonts, Archaeplastida, Chromalveolates, Excavates), emphasizing model 
species for which better genomes with experimental annotations were available 
(Supplementary Table 1). The selected set comprises 18 unikonts (16 Opisthokonta 
and 2 Amoebozoa), 6 Archaeplastida (5 Viridiplantae and 1 Rhodophyta), 
8 Chromalveolates (5 Alveolata, 3 Stramenopiles) and 5 Excavates (2 Euglenozoa, 
1 Fornicata, 1 Parabasalia and 1 Heterolobosea). Similarly, for the prokaryotic 
genomes, we defined 692 levels based on taxonomic criteria. This set represents 
all 681 prokaryotic genera present in EggNOG version 4 and 11 groups in which 
the ‘genus’ rank is not assigned (‘no rank’). Genomes with non-informative taxonomic
assignments, including the words ‘environmental’ and ‘unclassified’, were
not considered. For each of the orthologous groups, we randomly sampled one 
sequence from each of the 729 taxonomic levels defined (37 eukaryotic species 
plus 692 prokaryotic levels). 
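
The sub-sampling step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the `(sequence_id, taxonomic_level)` input layout and the toy identifiers are all assumptions made for the example.

```python
import random
from collections import defaultdict

def subsample_group(sequences, rng):
    """Pick one random sequence per taxonomic level within one orthologous group.

    `sequences` is a list of (sequence_id, taxonomic_level) pairs. Both fields
    are hypothetical and do not mirror the eggNOG file formats.
    """
    by_level = defaultdict(list)
    for seq_id, level in sequences:
        by_level[level].append(seq_id)
    # One representative per level; levels absent from the group contribute nothing.
    return {level: rng.choice(ids) for level, ids in sorted(by_level.items())}

# Toy orthologous group: two genera and one eukaryotic species (made-up data).
group = [
    ("seqA1", "Escherichia"), ("seqA2", "Escherichia"),
    ("seqB1", "Rickettsia"),
    ("seqC1", "Homo sapiens"), ("seqC2", "Homo sapiens"),
]
picked = subsample_group(group, random.Random(0))
print(picked)
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which matters when the same subsample has to feed several downstream tree reconstructions.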

Phylogenetic analysis and identification of LECA families. The detection of 
LECA families (that is, groups of related eukaryotic sequences that are inferred to 
be derived from LECA) was done in two steps. First, maximum likelihood trees 
were computed using a fast approach. For this we first built alignments of the 
8,188 filtered orthologous groups using MAFFT version 6.861b (ref. 17) and the --auto
parameter. These alignments were trimmed using trimAl version 1.4 (ref. 18) with 
a gap score cutoff of 0.01. Then, maximum likelihood phylogenetic trees were 
reconstructed using FastTree 2.1.7 (ref. 19) and the WAG evolutionary model 
(-wag). These trees were inspected to identify monophyletic groups of three or 
more eukaryotic sequences, corresponding to eukaryotic protein families. Similarly 
to previous studies (ref. 8), eukaryotic sequences within one orthologous group were
not considered a priori monophyletic, as the same group could comprise different 
eukaryotic groups derived from ancestral duplications subsequent to LUCA but 
preceding LECA (see also ref. 20). This resulted in the identification of multiple 
eukaryotic LECA families in some orthologous groups. 
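
As an illustration of the tree inspection described above, the sketch below finds maximal all-eukaryotic clades of three or more sequences in a rooted tree. The nested-tuple tree encoding, the `euk_`/`bac_` leaf names and the function names are assumptions for the example; the actual analysis was run on FastTree output, whose rooting and parsing are not reproduced here.

```python
def leaves(node):
    """Return the leaf names under a node (a leaf is a string, a clade a tuple)."""
    if isinstance(node, str):
        return [node]
    return [leaf for child in node for leaf in leaves(child)]

def eukaryotic_clades(node, is_euk, min_size=3):
    """Return maximal clades, as leaf lists, that contain only eukaryotic leaves."""
    tips = leaves(node)
    if all(is_euk(tip) for tip in tips):
        # Whole subtree is eukaryotic: keep it only if it is large enough.
        return [tips] if len(tips) >= min_size else []
    if isinstance(node, str):
        return []
    # Mixed subtree: look for eukaryotic clades inside each child.
    return [clade for child in node
            for clade in eukaryotic_clades(child, is_euk, min_size)]

# Toy rooted tree for one orthologous group (made-up topology).
tree = ((("euk_a", "euk_b"), "euk_c"),
        ("bac_x", (("euk_d", "euk_e"), "euk_f")))
families = eukaryotic_clades(tree, lambda name: name.startswith("euk_"))
print(families)
```

In this toy tree a prokaryotic sequence separates two eukaryotic clades, so the single orthologous group yields two candidate families, mirroring the situation described in the text.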

In the subsequent step we performed a second phylogenetic analysis of the 
identified eukaryotic LECA families. For this we considered only the sequences in 
the given eukaryotic family and all the prokaryotic sequences in the tree, and used a 
more accurate phylogenetic approach. We used a similar pipeline to that described 
in ref. 21. In brief, multiple sequence alignments using three different aligners, 
MUSCLE version 3.8.31 (ref. 22), MAFFT version 6.861b (ref. 17) and DIALIGN-TX
1.0.2 (ref. 23), were performed in forward and reverse orientation. The six resulting 
alignments were combined with M-COFFEE version 8.80 (ref. 24) into a maximal-consensus
alignment, which was trimmed using trimAl version 1.4 (ref. 18)
with a gap score cutoff of 0.01. For each sequence alignment, the best-fit evolu- 
tionary model selection was done before phylogenetic inference using ProtTest 
version 3 (ref. 25). In each case three different evolutionary models were tested 
(JTT, WAG, LG). The model best fitting the data was determined by comparing
the likelihood of all models according to the Akaike information criterion. Finally, 
a maximum likelihood tree was inferred with RAxML version 8.0.22 (ref. 26)
using the best-fitting model and a discrete gamma-distribution model with four 
rate categories plus invariant positions. The gamma parameter and the fraction of 
invariant positions were estimated from the data. SH-like branch support values 
were computed using RAxML version 8.0.22. Only the eukaryotic families whose
monophyly was also recovered in this second phylogenetic step, and for which the 
support value of the branch between this clade and the prokaryotic sister clade 
was higher than 0.5, were further considered in the analysis. For the phylogenetic 
analysis, the execution of the different phylogenetic workflows was done using 
the bioinformatics tool ETE version 2.3 (ref. 27) as environment in the single gene 
tree execution mode. 
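
The model-selection step can be illustrated with the standard definition of the Akaike information criterion, AIC = 2k - 2 ln L, where k is the number of free parameters and L the maximized likelihood. The sketch below is not ProtTest itself; the log-likelihoods and parameter counts are invented for the example.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood

def best_model(fits):
    """Return the model name with the lowest AIC.

    `fits` maps a model name to (log_likelihood, n_free_parameters).
    """
    return min(fits, key=lambda name: aic(*fits[name]))

# Hypothetical fits of three empirical substitution models to one alignment.
fits = {
    "JTT": (-10250.4, 41),
    "WAG": (-10231.9, 41),
    "LG": (-10198.7, 41),
}
chosen = best_model(fits)
print(chosen, aic(*fits[chosen]))
```

When all candidate models have the same number of free parameters, as assumed in this toy case, ranking by AIC is equivalent to ranking by likelihood.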

Detection of eukaryotic families present in LECA. Our workflow provided 
us with a flexible framework for evaluating the effect on the final outcome of 
different definitions of LECA. Results using alternative criteria are discussed in
Supplementary Information section 1. Similarly to previous analyses (ref. 8), a eukaryotic
family was inferred as being derived from LECA on the basis of its presence
in different major eukaryotic groups. In particular, the requisites for inclusion in
LECA are similar to those used in ref. 8, but with some important differences.
For instance, the criteria used in ref. 8 could be met by genes present only in 
Archaeplastida and Chromalveolates, a pattern that suggests genes are acquired 
in Chromalveolates through secondary endosymbiosis”™*. Our criteria required the 
presence of sequences both from unikonts and from at least one of the other groups 
among bikonts (Archaeplastida, Chromalveolates and Excavates; see also Extended 
Data Fig. 4a). This procedure rendered 1,078 families, 433 of which were present 
in all 4 groups and 323 in at least three groups, including unikonts. Upon using 
more stringent definitions our main results were not affected, but the number of 
families that could be selected for analysis was significantly reduced (see Extended 
Data Fig. 4b and Supplementary Information section 3.1). 
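
The inclusion criterion stated above (unikonts plus at least one bikont group) reduces to a simple set test. The sketch below is illustrative only; the function name and the string labels for the four groups are assumptions.

```python
BIKONT_GROUPS = {"Archaeplastida", "Chromalveolates", "Excavates"}

def is_leca_family(groups_present):
    """True if a family has sequences from unikonts and from at least one bikont group."""
    groups = set(groups_present)
    return "unikonts" in groups and bool(groups & BIKONT_GROUPS)

print(is_leca_family(["unikonts", "Excavates"]))               # passes the criterion
print(is_leca_family(["Archaeplastida", "Chromalveolates"]))   # lacks unikonts
```

The second call shows the case singled out in the text: a family restricted to Archaeplastida and Chromalveolates is rejected, because that pattern may reflect secondary endosymbiosis rather than presence in LECA.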

Inference of the prokaryotic sister group and phylogenetic origin. We used 
a nearest neighbour approach for estimating the prokaryotic affiliation of each 
LECA group (see Fig. 1a). For that, the phylogenetic trees were first rooted to the 
prokaryotic sequence that was most distant from the eukaryotic LECA family. 
Then, the phylogenetic origin of each LECA family was assigned by evaluating the 
prokaryotic species present in the sister tree partition and using the NCBI taxon- 
omy to define the narrowest taxonomic level that included all prokaryotic species 
present in that partition. For instance, if only sequences from Alphaproteobacteria 
and Betaproteobacteria were present in the sister branch, the inferred origin would 
be ‘proteobacteria’. If sequences from any bacterial group(s) were present together
with sequences from any archaeal group(s), the group of origin would be con- 
sidered ‘cellular organisms’ and so on. Given the hierarchical structure of NCBI 
taxonomy, this assignment inherited all parent taxonomic levels included within 
it. For example, a LECA family with an inferred origin in Rickettsiales was also 
assigned alphaproteobacterial, proteobacterial and bacterial origins. 
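
The assignment rule above amounts to finding the deepest taxon shared by the lineages of all species in the sister partition. A minimal sketch follows, assuming each lineage is available as a root-to-leaf list of taxon names; the function name and the abbreviated lineages are illustrative and omit intermediate NCBI ranks.

```python
def narrowest_common_taxon(lineages):
    """Return the shared lineage prefix of all sister-group species.

    The last element is the inferred origin; the whole list gives the parent
    levels that the family inherits.
    """
    common = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        common.append(ranks[0])
    return common

alpha = ["cellular organisms", "Bacteria", "Proteobacteria", "Alphaproteobacteria"]
beta = ["cellular organisms", "Bacteria", "Proteobacteria", "Betaproteobacteria"]
archaeon = ["cellular organisms", "Archaea", "Euryarchaeota"]

print(narrowest_common_taxon([alpha, beta]))      # origin: Proteobacteria
print(narrowest_common_taxon([alpha, archaeon]))  # origin: cellular organisms
```

Returning the full prefix rather than only its last element makes the inheritance of parent levels explicit.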
Measurement of the phylogenetic distance to the last common prokaryotic 
ancestor of LECA families: stem lengths. The branch of interest of each gene 
family tree is the one connecting the last common ancestor of the LECA family 
with the common ancestor of this and the nearest prokaryotic sister group to LECA 
(stem, see Fig. 1a). The length of this branch corresponds to the expected number 
of substitutions per site in that lineage: that is, the amount of change from the 
incorporation of the gene into the eukaryotic lineage until LECA. As this also 
depends on the evolutionary rate of each gene, we normalized the stem length 
value by dividing it by the median of the branch lengths within the LECA family. 
We chose the median because of its robustness with respect to extreme outliers 
(very long branches resulting from fast evolving sequences or phylogenetic arte- 
facts). In the text we refer to this corrected branch length value as stem length. Our 
rationale for this correction is the following: across families, the time of divergence 
from LECA is, by definition, the same. Therefore, differences in eukaryotic branch 
lengths across families are expected to reflect differences in evolutionary rates. 
By applying this correction, we thus divide by a constant (time from LECA) and 
a rate, which varies from family to family. This can schematically be expressed by 
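the following relationship:

$$\text{stem length} = \frac{\text{raw stem length}}{\text{eukaryotic branch length}} = \frac{R_s T_s}{R_e T_e}$$

where $R_s$, $T_s$ and $R_e$, $T_e$ are the evolutionary rate (R) and divergence time (T) of the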
stem (s) and the eukaryotic clade (e), respectively. Under the assumption that rates 
pre- and post-LECA are correlated (that is, not necessarily constant), this normal- 
ization compensates for differences in rates in the pre-LECA branches, providing 
a closer measurement of the divergence time from the prokaryotic ancestor to the 
LECA. Although we cannot rule out that major rate shifts in pre- and post-LECA
branches occurred in some cases, we consider it unlikely that they affected in a 
similar way all proteins of the same phylogenetic origin, regardless of their func- 
tion; or that they affected in an opposite way proteins with similar function but of 
different phylogenetic origin. Nevertheless, we performed comparisons using the 
raw stem lengths as well as with the (normalized) stem lengths. 
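
The normalization can be sketched in a few lines. The example assumes that the raw stem length and the eukaryotic branch lengths of a family have already been read from its tree; all numbers are invented to show the effect of the correction.

```python
from statistics import median

def normalized_stem_length(raw_stem, eukaryotic_branch_lengths):
    """Divide the raw stem length by the median branch length within the LECA family."""
    return raw_stem / median(eukaryotic_branch_lengths)

# Two hypothetical families acquired at the same time; the second evolves twice as fast.
slow = normalized_stem_length(0.4, [0.18, 0.20, 0.22, 0.20, 5.0])  # 5.0 is an outlier tip
fast = normalized_stem_length(0.8, [0.36, 0.40, 0.44, 0.40, 0.41])
print(slow, fast)
```

Both toy families end up with the same normalized value although their raw stems differ twofold, and the single very long branch in the first family does not shift its median.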

LECA family descriptors. Each LECA family was assigned a phylogenetic 
origin and a normalized stem length. In addition, they also received the functional 
(COG functional categories) and GO annotations provided by the eggNOG data- 
base. Annotations included functional categories of the corresponding orthologous 
groups as defined in the COG database, as well as GO cellular component annota- 
tions, of which we only considered terms that had experimental evidence codes and 
that were present in the GO slim generic cut-down version of the GO ontologies. 
After testing alternative thresholds, GO terms were assigned to the correspond- 
ing families if they were present in more than 10% of the sequences in the family 
considered. For the correspondence analysis (see below), where very rare terms 
could bias the statistical inference, we used a stricter approach, considering only
GO slim terms that were assigned to sequences from more than one group among 
unikonts, Archaeplastida, Chromalveolates and Excavates. Finally, through the 
corresponding orthologous groups, COG functional categories and GO slim terms 
were linked to prokaryotic groups and stem lengths, which were later used for 
profile comparison (Fig. 2c, d and Extended Data Fig. 1c, d). For convenience, we 
list here the COG functional categories corresponding to the one-letter codes: A, 
RNA processing and modification; B, chromatin structure and dynamics; C, 
energy production and conversion; D, cell cycle control and mitosis; E, amino-acid 
metabolism and transport; F, nucleotide metabolism and transport; G, carbohy- 
drate metabolism and transport; H, coenzyme metabolism; I, lipid metabolism; 
J, translation; K, transcription; L, replication and repair; M, cell wall/membrane/ 
envelope biogenesis; N, cell motility; O, post-translational modification, protein
turnover, chaperone functions; P, inorganic ion transport and metabolism; Q, 
secondary structure; T, signal transduction; U, intracellular trafficking and secre- 
tion; Y, nuclear structure; Z, cytoskeleton; R, general functional prediction only; 
S, function unknown. 
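
The 10% rule for transferring GO terms from sequences to families can be sketched as follows. The input layout (one collection of terms per sequence) and the function name are assumptions, and the filtering by evidence code and GO slim membership is taken as already done.

```python
from collections import Counter

def family_go_terms(sequence_terms, min_fraction=0.10):
    """Return terms present in more than `min_fraction` of a family's sequences.

    `sequence_terms` holds one iterable of GO terms per sequence in the family.
    """
    n_sequences = len(sequence_terms)
    # Count each term once per sequence, even if it is listed several times.
    counts = Counter(term for terms in sequence_terms for term in set(terms))
    return {term for term, count in counts.items() if count / n_sequences > min_fraction}

# Hypothetical family of 20 sequences: 3 mitochondrial, 2 nuclear, 15 unannotated.
family = [["mitochondrion"]] * 3 + [["nucleus"]] * 2 + [[]] * 15
print(family_go_terms(family))
```

Here the term carried by 15% of the sequences is kept, whereas the term carried by exactly 10% is dropped, since the text requires more than 10%.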

Unsupervised clustering and enrichment analyses. The clustering of the stem 
lengths into different components was done by fitting a Gaussian mixture model 
using the expectation-maximization algorithm as implemented in the Mclust pack- 
age”’ in R. The Mclust function returns the optimal model—the optimal number 
of components and membership—according to a maximum likelihood estimation 
and the Bayesian information criterion for expectation-maximization, initialized 
by hierarchical clustering for parameterized Gaussian mixture models. Applying 
the algorithm to the distribution of the normalized stem lengths from the LECA 
inference clustered the data into five components/subpopulations, of which the 
fifth, with only 14 extreme observations (with values in the range 2.3-7.1), also 
enriched in archaeal origins, was considered an outlier and was excluded. Each of 
the 1,064 remaining LECA families was assigned a membership within the four 
remaining components. Each of these subgroups was tested for enrichment in 
prokaryotic groups of origin, COG functional categories and GO cellular com- 
ponent terms. Enrichments were calculated using 10° permutations, in which the 
family memberships were randomly reshuffled and the P values estimated as the 
number of times a given origin, COG category or GO term had a count in the given 
component equal or greater than the observed one (Table 1). 
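
A minimal sketch of such a permutation-based enrichment test is given below. It assumes that the P value is the number of qualifying permutations divided by the total, as stated for the other permutation tests in these Methods; the function name, the labels and the data are made up, and far fewer permutations are used than in the paper.

```python
import random

def enrichment_p_value(components, labels, component, label, n_perm, rng):
    """Permutation P value for the enrichment of `label` in `component`.

    `components[i]` and `labels[i]` describe family i. Component memberships
    are reshuffled while the labels stay fixed.
    """
    def count(memberships):
        return sum(c == component and l == label for c, l in zip(memberships, labels))

    observed = count(components)
    shuffled = list(components)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count(shuffled) >= observed:
            hits += 1
    return observed, hits / n_perm

# Toy data: 20 families in two components; all 8 archaeal families sit in component 1.
components = [1] * 10 + [2] * 10
labels = ["archaea"] * 8 + ["bacteria"] * 12
observed, p_value = enrichment_p_value(components, labels, 1, "archaea", 2000, random.Random(1))
print(observed, p_value)
```

Shuffling memberships rather than labels keeps the size of each component fixed, which matches the description of reshuffled family memberships.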

Statistical comparisons of stem lengths. The statistical significance of the 
observed differences in normalized stem lengths between the different groups 
(taxonomic groups or functional categories and GO terms) was assessed with a 
non-parametric two-sided Mann-Whitney U-test for pairwise, or among three, 
comparisons. In the case of comparisons among three groups, the P values were 
adjusted for multiple testing with a correction for false discovery rate using the 
p.adjust function in R. The significance of the observed difference between the 
normalized stem lengths associated with the various groups and the overall bacte- 
rial signal was assessed using a permutation test with 10° randomizations. In each 
round, by sampling the whole distribution, the values were randomly assigned to 
the various eukaryotic families, and the mean, resulting from the random sampling 
of each of the groups, was computed (every group in each round had the same size 
but random values). The P value for each group was calculated as the number of 
times that a mean equal to or more extreme than the observed one occurred by
chance, divided by the overall number of randomizations. 
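
The randomization of group means can be sketched as below. The example takes "more extreme" to mean "lower", which suits a group suspected of having short stems; that direction, the function name and the toy values are assumptions.

```python
import random
from statistics import mean

def permutation_p_lower(group_values, all_values, n_perm, rng):
    """Fraction of random groups of the same size whose mean is <= the observed mean."""
    observed = mean(group_values)
    size = len(group_values)
    hits = sum(mean(rng.sample(all_values, size)) <= observed for _ in range(n_perm))
    return observed, hits / n_perm

# Toy distribution of 40 stem lengths; the group under test holds the 5 shortest.
all_values = [i / 10 for i in range(1, 41)]
group_values = all_values[:5]
observed_mean, p_value = permutation_p_lower(group_values, all_values, 2000, random.Random(7))
print(observed_mean, p_value)
```

Sampling without replacement from the whole distribution gives each random group the same size as the real one but random values, as the text describes.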


Statistical associations. We used a permutation test (10° permutations) to 
evaluate the relationships between the proteins’ evolutionary origin and their 
function/subcellular localization. The observed association was estimated as the 
number of co-occurrences between a given term and a given prokaryotic group 
of origin throughout all the families. The P value was calculated as the number 
of times that the amount of random co-occurrences between a group-term pair 
was equal or higher than the observed, divided by the number of permutations. 
Correspondence analysis is a statistical multivariate technique, conceptually 
similar to principal component analysis, that has been widely used to visualize 
associations between categorical variables. Briefly, it decomposes the χ² statistic
associated with the two-way table into orthogonal factors that maximize the sep- 
aration between row and column scores. Correspondence analysis was applied to 
the contingency table of co-occurrences between the inferred taxonomic groups 
of prokaryotic origins (rows) and the various annotation terms (columns). The 
biplots in Fig. 3 and Extended Data Fig. 8a, b show the best two-dimensional 
approximation (first two principal axes) of the distances between rows and col- 
umns in each case. For the computation we used the ca function of the ca package 
in R, after removing very rare observations (single observation columns) that 
could bias the representation. 
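
The quantity that correspondence analysis decomposes can be made concrete without any linear-algebra library. The sketch below builds the matrix of standardized residuals of a contingency table and checks that its total squared sum (the total inertia) equals the χ² statistic divided by the table total. The singular value decomposition that yields the principal axes, as performed by the ca package in R, is not reproduced; the table and function names are invented.

```python
def standardized_residuals(table):
    """Return (S, n): standardized residuals of a two-way table and its grand total."""
    n = sum(sum(row) for row in table)
    row_mass = [sum(row) / n for row in table]
    col_mass = [sum(col) / n for col in zip(*table)]
    residuals = [
        [(x / n - r * c) / (r * c) ** 0.5 for x, c in zip(row, col_mass)]
        for row, r in zip(table, row_mass)
    ]
    return residuals, n

def chi_square(table):
    """Pearson chi-square statistic of a two-way table."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    return sum(
        (x - r * c / n) ** 2 / (r * c / n)
        for row, r in zip(table, row_tot)
        for x, c in zip(row, col_tot)
    )

# Hypothetical co-occurrence counts: origins (rows) by annotation terms (columns).
counts = [[30, 5, 5], [4, 25, 6], [6, 5, 20]]
S, n = standardized_residuals(counts)
total_inertia = sum(value ** 2 for row in S for value in row)
print(total_inertia, chi_square(counts) / n)
```

The principal inertias reported along the axes of a biplot are the squared singular values of `S`, so they sum to the total inertia computed here.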
Extended Data Figure 1 | Sister group distribution and extended
phylogenetic distance profiles. a, Ring plot showing the distribution of 
inferred prokaryotic origins. Inner layers represent hierarchically lower 
(broader) taxonomic levels. The number of LECA families assigned to 
each group is indicated in parentheses next to the corresponding level in 
the ring plot or in the boxes below. b, Box plot showing the distributions 
of branch lengths in the different bacterial components. Measured stem 
lengths (sl), raw stem lengths (rsl), and the medians of the lengths from 
LECA to branch tips inside the eukaryotic families (ebl), as defined in 
Fig. 1a, are shown. Permutation tests were performed to evaluate the 
statistical significance of the differences between the distributions. 

A total of 10° permutations were performed, with the values being 
randomly shuffled in each permutation (see also Methods). The arrows 


and symbols above the boxes refer to the statistical significance of the 
differences observed compared with randomly shuffled distributions 
(lower values, downward red arrow; higher values, upward green 
arrow). The correspondence between the symbols and the P values is as 
follows: ~P<1 x 1071}; *P<5 x 1077; **P<1 x 107%; ***P<1 x 1073; 
eek D < 1 x 10~ ©, The lower and upper box limits correspond to 

the first and third quartiles (the 25th and 75th percentiles). c, d, Stem 
length profiles of the various functional categories (c) and GO slim 
cellular components (d) are shown. As in Fig. 2c, the stem lengths are 
also evaluated by looking only at the bacterial component to exclude 
the possibility that the observed differences are due solely to archaeal- 
bacterial differences. The significance was assessed with permutation tests 
(10° permutations) and is indicated with arrows as in b. 
Extended Data Figure 2 | Families of archaeal origin have significantly
longer stems than families of bacterial origin across different 
functional categories, similar selective pressures, and connectivities/ 
expression levels. a, The stem lengths, raw stem lengths, and eukaryotic 
branch lengths, between families of archaeal and bacterial inferred origin, 
are compared across the three major functional categories. While the 
eukaryotic branch lengths among the groups do not show significant 
differences, differences are detected in their respective stems (raw stem 
lengths and stem lengths). b, Archaeal and bacterial LECA families of 
similar selective pressures (as measured by dN/dS values across family 
members) differ significantly in terms of their raw stem lengths. Sets of 
families from both groups were matched with respect to their dN/dS values 
in the indicated reference species. The dN/dS data were downloaded from 


Ensembl for family members corresponding to Homo sapiens (Metazoa), 
Aspergillus nidulans (fungi) and Zea mays (plants) (see Supplementary 
Information section 1). The comparison of the raw stem lengths of the 
two sets shows that archaeal families generally have significantly longer 
stems (upper plots), and functions within the ‘information storage and 
processing’ category (lower plots), irrespective of their selective pressures. 
c, Archaeal and bacterial LECA families of similar connectivity/expression 
levels show significantly different raw stem lengths (see Supplementary 
Information section 1). In a—c, differences between the archaeal and 
bacterial component were evaluated with a two-tailed Mann-Whitney 
U-test and the P value is indicated in each case (*P<5 x 1077; 

~P<1x 1075 #P>1). 
Extended Data Figure 3 | Analysis of the cyanobacterial signal
in primary plastid-bearing eukaryotes. a, Ring plot showing the
distribution of inferred prokaryotic origins in widespread plant protein
families, as in Extended Data Fig. 1a. The profile of inferred origins
of eukaryotes that acquired a plastid through primary endosymbiosis
carries a strong signal from the cyanobacterial endosymbiont.
b, c, Families of inferred cyanobacterial origin have significantly shorter
stem lengths and raw stem lengths than alphaproteobacterial families
(b) and than the random distribution of stem lengths from the
bacteria-inferred component (c), pointing to a more recent acquisition of plastids
(post-LECA). d, Overall, as with mitochondrial localized proteins, those
proteins localized to plastids have shorter stems than the nuclear and
endomembrane system proteins. e, Schematic representation of the
expected difference in stems, given that cyanobacterial endosymbiosis
occurred after the diversification of the major eukaryotic lineages. As
confirmed, the raw stem lengths measured from plant protein families to
their common ancestor with cyanobacteria are shorter than those whose
origin can be traced back to Alphaproteobacteria or other bacterial groups.
Two-tailed Mann-Whitney U-test P value symbols in b and d are as in
Extended Data Fig. 1; additionally ****P <1 x 1074; *****P<1x 10.
Extended Data Figure 4 | Effect of alternative LECA definitions.
a, The four eukaryotic groups including all 37 selected eukaryotic species
used in the analysis are shown next to the NCBI taxonomic structure,
with the higher groupings modified according to the Tree of Life Project
(http://tolweb.org/Eukaryotes/3). b, Stricter LECA definitions have a
much larger effect on the bacterial component than on the archaeal
component, which is more widespread among eukaryotic groups. c, The
effect of different LECA definitions in terms of taxonomic assignments
and differences in stem lengths between proteins of alphaproteobacterial
origins and those derived from other bacteria. Numbers in parentheses
indicate the total number of LECA families that passed the threshold. The
kernel density plots, as in Fig. 2b, show the observed stem length means
for Alphaproteobacteria compared with 10° random samplings among
values in protein families of bacterial origin. The observed means (tops)
are shown with a dashed red line, reflecting the P value of each test, and
indicated next to the plot. See also Supplementary Information section 3.1.
Extended Data Figure 5 | Alphaproteobacterial-derived proteins
have consistently shorter branches, irrespective of the methods, data
sets, and support thresholds. Kernel density plots of the random mean 
distributions of the stem lengths are shown for the different methods, data 
sets and support thresholds used (see also Supplementary Information 
sections 3.2 and 3.3). The observed alphaproteobacterial means (tops) are
as in Fig. 2b. a, Results after using either the phylogenetic trees provided 
by the authors in ref. 8 (upper left), our standard phylogenetic pipeline
applied to their sampling of sequences (upper right) or alternative 
phylogenetic pipelines or samplings from EggNOG (lower). b, The main 
result is robust against progressively stricter support thresholds until the 
sample size becomes too small (support threshold > 0.9). Numbers in 
parenthesis indicate the number of bacteria-inferred LECA families for 
each threshold. 
Extended Data Figure 6 | Evaluation of alternative HGT scenarios and
other potential biases. a, The sampling effect was simulated by artificially 
removing part or all of the alphaproteobacterial sequences in the final 
data sets. To simulate the potential bias caused by an enriched sampling 
of Alphaproteobacteria, an artificial reduction of alphaproteobacterial 
sequences to 50% was applied to the data set (‘HALF alpha sampling’).
The reduction of alphaproteobacterial sequences by 50% does not 
significantly change the inferred stem length within families of 
alphaproteobacterial origin. #Cases where the difference was not 
significant. b, Different scenarios of HGT to the proto-mitochondrion are 
unable to explain the observed signal in families mapped to non-alpha 
Bacteria. The transfer of a gene from Alphaproteobacteria to another 
bacterial lineage after mitochondrial endosymbiosis and its parallel loss 
from the lineage of the mitochondrial ancestor (‘post-mito HGT
from alpha’) would result in unchanged stem lengths. Loss of a gene
from the alphaproteobacterial sister clade would result in an increase of
the inferred stem lengths (‘vertical transmission/pre-mito HGT from
alpha’). The transfer of a gene from the protoeukaryotic lineage to other
bacterial clades would result in shorter stem lengths compared with the
alphaproteobacterial mappings (‘post-mito HGT from protoeukaryote’).
c, Upon total exclusion of alphaproteobacterial sequences (‘NO alpha
sampling’), eukaryotic families map to other bacterial groups but
with stem lengths higher than those typically observed. The same is
observed when comparing the stem lengths of the families mapping
to proteobacterial groups in the absence of Alphaproteobacteria
with those typically mapping to proteobacterial groups other than
differences in the stem lengths between alphaproteobacterial families with 
mitochondrial localization compared with those with other subcellular 
localizations (left), or between families involved in energy-related 
functions compared with those involved in other functional categories 
(right). e, Box plot showing no significant difference between the 
distribution of stem lengths of families of Rickettsiales-inferred origin and 
other Alphaproteobacteria. f, Alphaproteobacterial families in different 
functional categories show no difference in stem lengths. In all cases the 
distributions were compared using a two-sided Mann-Whitney U-test. See 
also Supplementary Information sections 4 and 5. 
Extended Data Figure 7 | LECA inference and Lokiarchaeota. Results
after the inclusion of Lokiarchaeota in our analysis. a, The distribution of 
the sister group inference among prokaryotic taxonomy is shown in a ring 
plot together with the number of families in each group in parentheses 

(as in Extended Data Fig. 1). b, Box plot showing the stem length profiles 
of the various prokaryotic groups. Lokiarchaeota show the lowest values 
among all archaeal groups but higher values than any bacterial group. 

The symbols correspond to the same P values explained in Extended Data 
Fig. 1 after applying a permutation test (10° permutations) for the archaeal 
and bacterial components, independently. c, Box plot with the comparison 
between the non-Loki archaeal, the Lokiarchaeota and the bacterial
stem length profiles. The P value symbols are as before (two-sided
Mann-Whitney U-test, correction for false discovery rate). d, Schematic 
representation of the effect of the absence of Lokiarchaeum sequences on 
the stem lengths. The inferred origin of 30 eukaryotic families that were 
previously mapped to other, mainly archaeal, groups within the eggNOG 
version 4 database, is Lokiarchaeota, when homologous sequences from 
this metagenome are included. A reduction in the observed stem lengths 
of the families of Lokiarchaeota-inferred origin is expected in the scenario 
of Lokiarchaeota being the closest known archaeal relative of Eukaryotes. 
See also Supplementary Information section 6. 
Extended Data Figure 8 | Correspondence of different LECA components
with different cellular localizations and functions (extended version
of Fig. 3). a-d, Different LECA components have different GO cellular
components (a, c) and functional (b, d) profiles. Genes of different origin
tend to have different functions and subcellular localizations. a, b, The
same correspondence analysis symmetrical biplots as in Fig. 3 in higher
resolution, with the names of the taxonomic group, the function and the
GO slim terms indicated next to the coordinates. The percentage of variance
explained by each principal component is indicated next to each axis in
parentheses. c, d, The contingency tables also used in correspondence
analysis are shown in the form of a heatmap. The asterisks in the different
cells reflect the significance of the association between a given origin and a
localization (c) or function (d), as computed using permutation tests
(10° permutations), where the annotations among each eukaryotic family
were reshuffled (see Methods). The correspondence between the symbols
and the P values is as in Extended Data Figs 1 and 3. e, The COG functional
categories, as organized in the three major groups ‘information storage and
processing’, ‘cellular processes and signalling’ and ‘metabolism’.
Deriving human ENS lineages for cell therapy and
drug discovery in Hirschsprung disease 


Faranak Fattahi!, Julius A Steinbeck!, Sonja Kriks!*, Jason Tchieu!, Bastian Zimmer! , Sarah Kishinevsky 


1,2,3 
’ 


Nadja Zeltner!?, Yvonne Mica!*+, Wael El-Nachef*, Huiyong Zhao’, Elisa de Stanchina*, Michael D. Gershon‘, 


Tracy C. Grikscheit®, Shuibing Chen’ & Lorenz Studer)? 


The enteric nervous system (ENS) is the largest component of the 
autonomic nervous system, with neuron numbers surpassing those 
present in the spinal cord¹. The ENS has been called the ‘second
brain’¹ given its autonomy, remarkable neurotransmitter diversity
and complex cytoarchitecture. Defects in ENS development are 
responsible for many human disorders including Hirschsprung 
disease (HSCR). HSCR is caused by the developmental failure 
of ENS progenitors to migrate into the gastrointestinal tract, 
particularly the distal colon². Human ENS development remains
poorly understood owing to the lack of an easily accessible model 
system. Here we demonstrate the efficient derivation and isolation 
of ENS progenitors from human pluripotent stem (PS) cells, and 
their further differentiation into functional enteric neurons. ENS 
precursors derived in vitro are capable of targeted migration in the 
developing chick embryo and extensive colonization of the adult 
mouse colon. The in vivo engraftment and migration of human 
PS-cell-derived ENS precursors rescue disease-related mortality 
in HSCR mice (Ednrb^s-l/s-l), although the mechanism of action
remains unclear. Finally, EDNRB-null mutant ENS precursors 
enable modelling of HSCR-related migration defects, and the 
identification of pepstatin A as a candidate therapeutic target. Our 
study establishes the first, to our knowledge, human PS-cell-based 
platform for the study of human ENS development, and presents 
cell- and drug-based strategies for the treatment of HSCR. 

The in vitro derivation of human ENS lineages from human PS cells 
has remained elusive despite the great medical needs. The ENS develops 
from both the vagal and sacral neural crest (NC). We focused on the 
vagal NC that generates most of the ENS and migrates caudally to 
colonize the entire length of the bowel². Current NC differentiation
protocols³ result in SOX10+ NC precursors that are HOX negative
(indicative of anterior/cranial identity; cranial neural crests (CNCs)). 
By contrast, vagal NC identity is characterized by the expression of 
specific, regionally restricted HOX genes including Hoxb3 (ref. 4) 
and Hoxb5 (ref. 5). Retinoic acid has been used previously as an 
extrinsic factor to shift the regional identity of central nervous sys- 
tem (CNS) precursors from anterior to more caudal fates during 
motor neuron specification⁶. Here we tested whether the addition of
retinoic acid can similarly direct the regional identity in NC lineages
and induce the expression of vagal markers (Fig. 1a). After treatment
of SOX10-green fluorescent protein (SOX10-GFP) ES cells (ref. 3)
with retinoic acid, we obtained GFP+ cells with yields comparable to
CNC conditions, indicating that retinoic acid treatment does not
interfere with overall NC induction (Fig. 1b and Extended Data
Fig. 1a, b). To facilitate the isolation of pure NC populations, we performed
a candidate surface marker screen and identified CD49D
(α4 integrin) as an epitope that reliably marks early SOX10+ NC
lineages (Fig. 1b and Extended Data Fig. 1a-c). We next used CD49D


to demonstrate the robustness of retinoic-acid-based NC induction 
across human PS cell lines (both human embryonic stem (ES) cells 
Figure 1 | Deriving ENC precursors from human ES cells. a, Protocol
(days 0-11) for deriving enteric NC (ENC) cells. CHIR, CHIR99021; KSR,
knockout serum replacement; LDN/SB, medium containing LDN193189
and SB431542; RA, retinoic acid. b, Flow cytometry of ENC for SOX10-GFP
and CD49D at day 11. c, Quantitative reverse transcriptase PCR
(qRT-PCR) for SOX10, vagal NC markers HOXB2-5 and HOXB9 in
CD49D+ ENC versus CNC cells; n = 3 independent experiments. FC, fold
change. d, Quantification of PAX3, RET and EDNRB immunofluorescence
in CD49D+ ENC lineage; n = 3 independent experiments. e, Unsupervised
clustering of CD49D+ NC versus matched CNS precursor (day 11).
f, Top 10 and selected additional (bold) differentially expressed transcripts
in CD49D+ ENC versus CNS precursors. C1orf192 is also known as
CFAP126; RP4-792G4.2 is an antisense transcript to FOXD3;
RP11-200A13.2 is a pseudogene. g, RFP+ and CD49D+ ENC precursors are
FACS-purified (day 11) for transplantation into developing chick embryos.
h, Whole-mount epifluorescence showing distribution of RFP+ cells 24 h
after injection. i, Cross section of the embryos at trunk levels shows RFP+
cells located in the gut anlage (left) and at higher magnification (right).
Scale bars, 200 µm (h) and 10 µm (i). Data are mean ± s.e.m. NS, not
significant. ***P < 0.001 (t-test, ENC compared to CNC; n = 3).
Figure 2 | Differentiation of human ES-cell-derived ENC precursors
into enteric neuron subtypes. a, Protocol for neuronal differentiation and 
maturation of ENC precursors. AA, ascorbic acid; NB/N2/B27, neurobasal 
medium with N2 and B27 supplement. b, SOX10-GFP-expressing 3D 
spheroids from purified ENC precursors gave rise to TUJ1 and PHOX2A 
enteric neuron lineage. c, d, Phase-contrast and immunofluorescence 
images (c) and quantification (d) at day 40. ENC-derived cells express 
TRKC, ASCL1 and PHOX2A/B. PHOX2B expression was confirmed 
using H9 human ES-cell-based PHOX2B-GFP reporter line; n =3 
independent experiments. e, f, Immunofluorescence analysis (e) and 
quantification (f) for expression of diverse neurotransmitters. Cells were 
derived from FACS-purified, CD49D+ ENCs to ensure NC origin; n = 3
independent experiments. SST, somatostatin; TH, tyrosine hydroxylase. 

g, Light-stimulated activation of ENC-derived neurons expressing 
channelrhodopsin-2 (ChR2). h, Phase-contrast and live fluorescence 
images of human ES-cell-derived smooth muscle cells (SMCs) and ENC- 
derived neuron co-cultures subjected to light stimulation. i, Diagram 
representing extent of contraction of SMCs before and during light
(450 nm) stimulation at increasing frequencies. Scale bars, 100 µm (b) and
50 µm (c, e, h). Data are mean ± s.e.m.
and induced pluripotent stem (iPS) cells; Extended Data Fig. 1d).
Purified CD49D+ NC precursors, derived in the presence of retinoic
acid, expressed HOXB2-HOXB5 indicative of vagal identity⁴,⁵, but not
more caudal HOX transcripts such as HOXB9 (Fig. 1c). In further agreement
with vagal identity, CD49D+, retinoic-acid-treated NC precursors
expressed markers of early enteric NC (ENC) lineages² including PAX3,
EDNRB and RET (Fig. 1d and Extended Data Fig. 1e, f). Given the
paucity of developmental data on human ENC development, we per- 
formed RNA sequencing (RNA-seq) analysis in ES-cell-derived ENC 
precursors, in CNCs (no retinoic acid), in melanocyte-biased? NCs 
(MNCs) (Extended Data Fig. 1a), and in stage-matched CNS precur- 
sors’. Unsupervised clustering reliably segregated the transcriptomes 
of all PS-cell-derived NC populations away from CNS precursors and 
further subdivided the various NC sublineages (Fig. 1e). The most
differentially expressed genes in the ENC compared to CNS lineage 
included general NC markers such as FOXD3 or TFAP2A but also 
PAX3 and HOX genes related to the ENC lineage (Fig. 1f). CNCs 
and MNCs were also enriched in general NC markers but showed 
high levels of NEUROG1, ISL1 or MLANA, TYR and DCT expression,
respectively, compatible with their subtype identity (Extended Data
Fig. 1g, h). Direct comparison of the various NC lineages yielded novel
candidate markers of human vagal NC/ENC lineage (Fig. 1f). A list
of the top 200 enriched transcripts for each NC lineage is provided
(Supplementary Tables 1-3). 

One key functional property of the ENC is the ability to migrate 
extensively and to colonize the gut. Red fluorescent protein (RFP)- 
labelled, CD49D+-purified (Fig. 1g) PS-cell-derived ENC precursors
were injected into the developing chick embryo at the level of the vagal 
NC. Transplanted human cells migrated along the trunk of the embryo 
(Fig. 1h) and colonized the gut (22 out of 57 embryos injected; Fig. 1i). 
By contrast, stage-matched CNC or MNC precursors targeted cra- 
nial regions (CNC) or followed a trajectory along the dermis (MNC) 
(Extended Data Fig. 1i). 

To address whether ES-cell-derived ENC precursors are capable of 
recreating ENS neuronal diversity, we maintained purified CD49D+
ENC precursors in 3D spheroids for 4 days followed by differentiation 
as adherent cultures in the presence of ascorbic acid and glial-cell-line- 
derived neurotrophic factor (GDNF) (Fig. 2a). The 3D spheroid 
step was required to retain high levels of SOX10-GFP expression 
(Fig. 2b). Replating of 3D spheroids under differentiation conditions 
yielded immature neurons expressing TUJ1 and the enteric precursor 
marker PHOX2A (day 20; Fig. 2b). Most PHOX2A+ cells were positive
for TRKC (also known as NTRK3), a surface marker expressed
in enteric neuron precursors⁸ suitable for enrichment for PHOX2A+
and ASCL1+ precursors (Extended Data Fig. 2a, b). Temporal expression
analyses (Extended Data Fig. 2c-e) showed maintenance of ENC
neuronal precursor marker expression by day 40 of differentiation
(Fig. 2c, d), followed by an increase in the percentage of mature neurons
by day 60 (Fig. 2e, f). In agreement with enteric neuron identity,
we observed a broad range of neurotransmitter phenotypes including
serotonin-positive (5-hydroxytryptamine+, 5-HT+), GABA+
(γ-aminobutyric-acid-positive) and nitric-oxide-synthase-positive
(NOS+) neurons. The presence of these neurotransmitters in neurons
derived from CD49D+-purified NC precursors indicates ENC origin,
since those neurotransmitters are not expressed in other NC lineages.
Indeed, no 5-HT+ neurons were observed in parallel cultures derived
from HOX-negative, CD49D+ cells (Extended Data Fig. 3a, b). CNC-
derived precursors differentiated into tyrosine-hydroxylase-expressing 
neurons (Extended Data Fig. 3c), and gave rise to TRKB-positive rather 
than TRKC-positive precursors, suggesting enrichment for sympathetic 
neuron lineages (Extended Data Fig. 3d, e). 

A major function of the ENS is the control of peristaltic gut move- 
ments. We probed the functionality of in-vitro-derived enteric neurons 
by assessing connectivity with smooth muscle cells (SMCs). ES- 
cell-derived SMCs were generated via a mesoderm intermediate 
after exposure to activin A and BMP4 in vitro⁹ and culture in the
presence of TGFβ (Extended Data Fig. 4a). The resulting SMC progenitors
expressed ISL1 and were immunoreactive for smooth muscle
actin (SMA) (Extended Data Fig. 4b). For connectivity studies,
an optogenetic reporter line (ref. 10) was used to allow for light- 
induced control of neuronal activity. Enteric neurons were derived 
from a human ES cell line expressing enhanced yellow fluorescent 
protein (eYFP)-tagged channelrhodopsin-2 (ChR2) under control
of the human synapsin (SYN) promoter (Fig. 2g). GABA+ and
5-HT+ neurons in these co-cultures were closely associated with
SMCs (Extended Data Fig. 4c). Interestingly, co-culture of day-25 
neurons with SMCs triggered accelerated neuronal maturation, as 
illustrated by the increased expression of SYN-eYFP (Extended 
Data Fig. 4d). Conversely, ES-cell-derived SMCs also showed signs 
of accelerated maturation under co-culture conditions as illustrated 
by the expression of mature markers (MYH11 and acetylcholine 
receptor (AchR); Extended Data Fig. 4e) and the ability to contract 
in response to pharmacological stimulation (Supplementary Videos 
1-6 and Extended Data Fig. 4f). While no spontaneous contractions 
were observed under co-culture conditions, a wave of SMC contrac- 
tions could be triggered 5-10s after light-mediated activation (10 Hz 
frequency) of SYN-ChR2-eYFP neurons (Fig. 2h, i and Supplementary 
Video 7). Notably, both light- and drug-induced SMC contractions 
Figure 3 | Human ENC precursors migrate extensively in normal and
HSCR adult colon. a, Transplantation model into the wall of proximal
colon (caecum) of adult mice. b, Whole-mount fluorescence imaging of
grafted RFP+ human ES-cell-derived ENC precursors inside adult colon
wall (NSG mice). c, Cross section of NSG colon showing robust survival
and TUJ1 staining of grafted cells. d, Survival curve of Ednrb^s-l/s-l (HSCR)
animals grafted with human ES-cell-derived ENC precursors versus
Matrigel-only. e, Immunofluorescence staining of normal or HSCR colons
transplanted with Matrigel (control) or RFP+ human ES-cell-derived
ENC precursors showing distribution of human cells in myenteric and
submucosal layers. Dashed lines indicate border between submucosal and
muscle layers. WT, wild type. f, Representative images of grafted animals
3 months after transplantation into Ednrb^s-l/s-l colon, showing expression
of GABA and 5-HT in SC121+ human cells. All in vivo experiments were
performed in a blinded manner. Ednrb^s-l/s-l mutation was confirmed in all
grafted animals; n = 6 animals each for treatment and control groups. Scale
bars, 1 cm (b), 500 µm (c, left), 100 µm (c, right, e) and 25 µm (f). P value
for survival analysis is given numerically, log-rank (Mantel-Cox) test.


were slow and involved the movement of sheet-like structures, sug- 
gesting coordination among cells possibly via gap-junction-mediated 
coupling. These studies demonstrate functional connectivity between ES- 
cell-derived enteric neurons and SMCs. In vivo interactions of the ENS 
within the gut, however, are more complex and involve several cell types. 
As a first step in modelling those interactions in 3D, we used a tissue- 
engineering approach combining in-vitro-derived human ENC precursors
(CD49D+, day 15) with mouse primary intestinal tissue (Extended
Data Fig. 5a). Using our previously established protocols to form organoid
units¹¹, the recombined tissue constructs were seeded onto a scaffold and
implanted onto the omentum of immunodeficient hosts for maturation
in vivo¹¹. Human cells were readily detected within gut-like structures
using the human-specific markers SC121 and synaptophysin. Importantly, 
cells were located both in epithelial and muscle layers (Extended Data 
Fig. 5b), showing their ability to interact with both target cell types. 
Transplantation of PS-cell-derived precursors could yield novel 
therapeutic opportunities for ENS disorders such as HSCR. Children 
with HSCR are currently treated by surgical removal of the aganglionic 
portion of the gut. While life-saving, the surgery does not address dys- 
function of the remaining gastrointestinal tract in surviving patients’. 
Furthermore, therapeutic options in patients with total aganglionosis 
are limited. A major challenge in developing a cell therapy for HSCR is 
the need to repopulate the ENS over extensive distances. Previous stud- 
ies tested the transplantation of a variety of candidate cell sources into 
the fetal or postnatal colon!*4, Mouse fetal-derived ENC precursors 
resulted in the most promising data, with evidence of functional integration
but limited in vivo migration¹⁵. To assess the ability of human
ES-cell-derived ENC precursors to migrate within the postnatal or 
adult colon (3-6 weeks of age), we performed orthotopic injections 
of CD49D+ RFP-labelled precursors into NOD-SCID-Il2rg−/− (NSG)
mice (Fig. 3a). Cells were injected into the wall of the caecum aiming
for the muscle layer (Fig. 3a) and resulting in a well-defined deposit of
RFP+ cells 1 h after injection (Extended Data Fig. 6a, left, and Fig. 3b,
top). Notably, 2-4 weeks after transplantation, RFP+ cells had migrated
extensively and repopulated the host colon over its entire length 
(Fig. 3b). The grafted ENC precursors formed clusters along the colon 
(Extended Data Fig. 6a, right) expressing TUJ1 (Fig. 3c). By contrast, 
stage-matched CNS and CNC precursors, grafted under identical 
conditions, showed limited migration (Extended Data Fig. 6b, c). 

Given the extraordinary ability of the grafted cells to repopulate the 
host colon, we tested whether those cells could provide therapeutic 
benefit in an HSCR model. One widely used genetic model is the 
Ednrb^s-l/s-l mouse¹⁶. These mice develop a megacolon owing to aberrant
peristalsis. As a consequence, mutant mice show high mortality by 4-6
weeks of age. Ednrb^s-l/s-l mice were injected 2-3 weeks after birth with
RFP+ human ES-cell-derived ENC precursors (treatment group) or
with Matrigel vehicle (control group). Most control-injected animals 
(n=6) died over a period of 4-5 weeks (Fig. 3d) with a megacolon- 
like pathology (Extended Data Fig. 6d). By contrast, all animals injected 
with ES-cell-derived ENC precursors (n =6) survived (Fig. 3d). 
Grafted animals were assessed for graft survival and migration at 
6-8 weeks of age. Whole-mount fluorescence imaging confirmed the 
migration of ES-cell-derived ENC precursors within the HSCR colon 
(Extended Data Fig. 6e, f). Preliminary studies showed a trend towards 
an improved gastrointestinal transit time in grafted versus the small 
subset of Matrigel-treated animals that survived beyond 6 weeks of 
treatment, as measured using carmine dye gavage (Extended Data 
Fig. 6g). Histological analyses at 6 weeks and 3 months after transplan- 
tation confirmed myenteric and submucosal localization of human cells 
in HSCR colon. While the location mimicked aspects of endogenous 
ENS, there was a preference towards the submucosal region (Fig. 3e 
and Extended Data Fig. 6h, top). Human cells were also detected in the 
distal colon (Extended Data Fig. 6h, bottom), where only few endogenous
TUJ1+ cells were detected. Immunocytochemistry for SC121
and human-specific synaptophysin, not detected in Matrigel-injected 
animals (Extended Data Fig. 6i), confirmed human identity and pre- 
synaptic marker expression (Fig. 3f, right). In addition to neuronal 
cells (Extended Data Fig. 6h), we also observed human cells expressing 
glial markers such as GFAP (Extended Data Fig. 6j). Neuron-subtype-specific
markers in SC121+ human cells included 5-HT, GABA
and NOS (Fig. 3f and Extended Data Fig. 6k, l).
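
The survival comparison in Fig. 3d is evaluated with a log-rank (Mantel-Cox) test. As a minimal sketch of how such a two-group comparison can be computed, the Python function below implements the standard log-rank statistic. The function name and the survival times are hypothetical placeholders chosen for illustration (they are not the measured data), and the analysis software actually used is not stated in this section.

```python
import math
from typing import List, Tuple


def logrank_test(times_a: List[float], events_a: List[int],
                 times_b: List[float], events_b: List[int]) -> Tuple[float, float]:
    """Two-group log-rank (Mantel-Cox) test.

    times_*  : follow-up time of each animal (death or end of observation)
    events_* : 1 if the animal died at that time, 0 if it was censored
    Returns (chi-square statistic with 1 degree of freedom, p value).
    """
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    death_times = sorted({t for t, e in zip(all_times, all_events) if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in death_times:
        at_risk_a = sum(1 for x in times_a if x >= t)
        at_risk_b = sum(1 for x in times_b if x >= t)
        deaths_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        deaths_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n = at_risk_a + at_risk_b
        d = deaths_a + deaths_b
        # Expected deaths in group A if both groups shared one hazard.
        observed_minus_expected += deaths_a - d * at_risk_a / n
        if n > 1:
            variance += d * (at_risk_a / n) * (at_risk_b / n) * (n - d) / (n - 1)

    if variance == 0.0:
        raise ValueError("no informative death times; statistic undefined")
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical example only: days of follow-up, NOT the data behind Fig. 3d.
control_days = [25, 28, 30, 32, 35, 42]
control_died = [1, 1, 1, 1, 1, 0]
treated_days = [42, 42, 42, 42, 42, 42]
treated_died = [0, 0, 0, 0, 0, 0]

chi2, p_value = logrank_test(control_days, control_died, treated_days, treated_died)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

With only six animals per group, the chi-square approximation behind this P value is coarse, so a result of this kind is best read alongside the survival curves themselves.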

Our findings demonstrate that wild-type human PS-cell-derived ENS 
precursors can repopulate the colon of HSCR mice. However, HSCR- 
causing mutations often affect the migration of ENS precursor in a cell 
autonomous manner”"’. Therefore, developing patient-matched cell 
therapies for HSCR may require complementary genetic or pharmaco- 
logical strategies to overcome intrinsic migration defects of the trans- 
planted cells. As causative genetic defects in HSCR patients are often 
not known or complex”, gene correction before transplantation may 
not be possible. We therefore assessed whether human PS-cell-derived 
ENS precursors can be used to model HSCR and serve as a platform 
to screen for candidate compounds that could overcome disease-re- 
lated migration defects. In a first step, we established isogenic ES cell 
lines with homozygous loss-of-function mutations in EDNRB using 
CRISPR/Cas-based gene-targeting techniques!” (Fig. 4a and Extended 
Data Fig. 7a, b). Loss-of-function mutations in EDNRB are a well- 
known genetic cause in a subset of HSCR patients””. ENC precursors 
could be derived at comparable efficiencies from EDNRB-mutant and 
control lines (Extended Data Fig. 7c). However, CD49D+ ENC precursors
from four EDNRB−/− clones showed a notable migration defect
when using the scratch assay to model aspects of the HSCR phenotype²¹
Figure 4 | EDNRB signalling regulates human ENC precursor cell
migration. a, In vitro HSCR disease modelling model. b, c, Representative
images and quantification of scratch assay in RFP+/CD49D+ wild-type
and EDNRB−/− human ES-cell-derived ENCs (clones (C) 1-4); n = 3
independent experiments. d, Illustration of chemical screen. e, Dose-response
of pepstatin A on migration of CD49D+ EDNRB−/− human
ES-cell-derived ENCs. f, Quantification of CD49D+ EDNRB−/− human
ES-cell-derived ENC migration after treatment with pepstatin A (10 µM)
or BACE inhibitor (i.) IV (1 µM); n = 3 independent experiments.
g, Quantification of cell migration after BACE2 knockdown using a
pool of five different short interfering RNAs (siRNAs) or four individual
siRNAs; n = 3 independent experiments. h, Pepstatin A pre-transplantation
treatment model. i, Whole-mount images of NSG colon transplanted with
RFP+ CD49D purified wild-type and EDNRB−/− human ES-cell-derived
ENC precursors with or without pepstatin A pre-treatment. j, Quantification
of the fraction of animals with human cells present in colon at increasing
distance from injection site (see Extended Data Fig. 8c); n > 8 animals for
each of the treatment groups. Scale bars, 200 µm (b) and 1 cm (i). Data are
mean ± s.e.m. **P < 0.01; ***P < 0.001; ****P < 0.0001 (c, f and g; analysis
of variance (ANOVA); Dunnett test (compared to wild type)). P values in j
are given numerically, log-rank (Mantel-Cox) test; n > 8 for each group.


(Fig. 4b, c). We next assessed whether EDNRB−/− ENCs display major
differences in cell proliferation or survival as compared to wild-type
ENCs. At 24 h after replating (time point used for scratch assay), we did
not observe differences in either assay (Extended Data Fig. 7d, e). By
72 h, EDNRB−/− ENCs showed reduced proliferation while cell viability
remained unaffected (Extended Data Fig. 7d, e). 

We next carried out a small molecule screen to identify compounds 
capable of rescuing the migration defects observed in EDNRB−/− ENC
precursors (Fig. 4d and Extended Data Fig. 8a). We screened 1,280 
compounds (Prestwick Chemical Library) consisting of FDA-approved 
drugs. Results were binned into compounds with no effect, toxic effects, 
modest effects or strong effects (Fig. 4d and Extended Data Fig. 8a-c). 
The hits were further validated by repeating the scratch assay under 
non-high-throughput-screening conditions (Extended Data Fig. 8d). 
Compounds with strong effects were chosen for further follow-up. 
Among those validated compounds, we focused on the molecule pep- 
statin A, which showed a dose-dependent rescue of the migration defect 
in EDNRB−/− ES-cell-derived NC precursors (Fig. 4e). Pepstatin A is a
known inhibitor of acid proteases²². Among potential pepstatin targets,
we explored BACE2 because RNA-seq data showed upregulation in
ES-cell-derived NC lineages (Extended Data Fig. 9a), and BACE2 had
been recently shown to modulate the migration of NC derivatives in
the developing zebrafish embryo²³. To address whether BACE2 mediates
the pepstatin A effect, we tested structurally unrelated small molecules
targeting BACE2. Exposure to the BACE inhibitor IV rescued
the migration defect in the scratch assay similar to pepstatin A (Fig. 4f). 
Furthermore, BACE2 knockdown confirmed rescue of the migration 
defect in EDNRB−/− cells (Fig. 4g and Extended Data Fig. 9b). Finally,
we tested whether pepstatin A exposure in vitro is sufficient to rescue the
in vivo migration behaviour of EDNRB−/− ENC precursors (Fig. 4h).
ENC precursors derived from EDNRB−/− cells exhibit a significant in vivo
migration defect after transplantation into the adult colon (Fig. 4i, j). 
EDNRB-null precursors pre-treated with pepstatin A for 72h before 
transplantation showed a significant rescue of in vivo migration (Fig. 4i, j 
and Extended Data Fig. 9c). Interestingly, wild-type-derived ENC pre- 
cursors treated with pharmacological inhibitors of EDNRB showed 
migration defects in vitro and in vivo (Extended Data Fig. 9d-f), further 
supporting a role for EDNRB in human ENC migration and HSCR. 
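
The binning step of the screen (no effect, toxic, modest or strong) can be pictured as a simple classification rule. The sketch below is illustrative only: the normalized rescue score, the viability measure, the compound names and every threshold are assumptions introduced here, because the text does not state the cut-offs that were used.

```python
from collections import Counter
from enum import Enum


class Effect(Enum):
    NO_EFFECT = "no effect"
    TOXIC = "toxic"
    MODEST = "modest"
    STRONG = "strong"


def classify(rescue: float, viability: float,
             toxic_below: float = 0.5,
             modest_from: float = 0.25,
             strong_from: float = 0.75) -> Effect:
    """Bin one compound from two hypothetical readouts.

    rescue    : 0 = migration of untreated mutant cells, 1 = wild-type migration
    viability : fraction of cells surviving relative to vehicle control
    All thresholds are placeholders, not values from the study.
    """
    if viability < toxic_below:
        return Effect.TOXIC  # toxicity overrides any apparent rescue
    if rescue >= strong_from:
        return Effect.STRONG
    if rescue >= modest_from:
        return Effect.MODEST
    return Effect.NO_EFFECT


# Made-up readouts for four imaginary compounds: (rescue, viability).
example_plate = {
    "compound_A": (0.05, 0.95),
    "compound_B": (0.40, 0.90),
    "compound_C": (0.85, 0.92),
    "compound_D": (0.90, 0.20),
}
summary = Counter(classify(r, v) for r, v in example_plate.values())
for effect in Effect:
    print(f"{effect.value}: {summary[effect]}")
```

Checking toxicity first reflects one possible design choice: a compound that kills most cells is set aside regardless of how the surviving cells migrate.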

Our study describes an efficient strategy to derive and prospectively 
purify ENC precursors from human ES cells. In agreement with studies 
in model organisms!”, we demonstrate that human ES-cell-derived 
ENC gives rise to a broad range of neurotransmitter phenotypes char- 
acteristic of the ENS. The ability to model human ENS development 
in vitro should enable the large-scale production of specific human 
enteric neurons on demand. For example, human PS-cell-derived
enteric 5-HT neurons could serve as a tool to model gastrointestinal
side effects of CNS-acting drugs such as Prozac¹.

We focused primarily on potential cell therapeutic applications of 
ES-cell-derived ENC lineages in HSCR. One of the most remarkable 
findings was the extensive in vivo migratory potential of human ENC 
precursors in the adult host colon. Future studies will have to define 
whether this ability is confined to early CD49D* cells or maintained 
at later neurogenic stages. Similarly, it will be important to look at 
long-term efficacy and safety. While most of our in vivo studies in 
NSG mice (a total of 102 animals grafted) were limited to a 5-6-week 
survival period, animals analysed at 3-4 months after transplantation 
showed comparable in vivo properties without evidence of tumour 
formation or graft-related adverse effects. The therapeutic potential of 
the cells is illustrated by their ability to rescue Ednrb^s-l/s-l mice. Future
studies will be required to address mechanisms of the graft-mediated 
host rescue. The potential for widespread engraftment may eventually 
enable permanent, bona fide repair of the aganglionic portions of the 
gut. However, given the rapid action of the cells in preventing death of 
HSCR mice, it appears unlikely that rescue was mediated by functional 
integration of the cells. Future studies should also address alternative 
mechanisms such as cytokine release, immunomodulation or changes 
in barrier function for contributing to the therapeutic effect. 

The identification of pepstatin A and BACE2 inhibition in rescuing 
HSCR-related migration defects represents proof-of-concept for the 
use of human ES-cell-derived ENC precursors in drug discovery. The 
mechanism of BACE2 inhibition on migration remains to be determined. 
Possible targets of BACE proteases include neuregulins and ErbB receptors 
previously implicated in NC migration²⁴,²⁵. One obvious future direction
is testing the therapeutic potential of BACE inhibitors in mouse models 
of HSCR. Such a strategy could enable the prevention of aganglionosis 
during pregnancy or enable repair of postnatal enteric neuron function. 
In the current study, we focused on combining pepstatin A as a neoadjuvant
treatment to promote migration of EDNRB−/− ENC precursors. An
important next question is whether cells pre-treated with pepstatin A are 
capable of rescuing lethality or other disease-associated phenotypes in 
HSCR mice. In conclusion, our work presents a powerful strategy to access 
human ENS lineages for exploring the second brain¹ in human health and
disease and for developing novel cell- and drug-based therapies for HSCR. 
METHODS

Culture of undifferentiated human ES cells. Human ES cell line H9 (WA-09) and 
derivatives (SOX10::GFP; SYN::ChR2-EYFP; SYN::EYFP; PHOX2B::GFP; EF1::RFP
Ednrb−/−) as well as two independent human iPS cell lines (healthy and familial
dysautonomia, Sendai-based, OMSK (Cytotune)) were maintained on mouse 
embryonic fibroblasts (Global Stem) in knockout serum replacement (KSR; 
Life Technologies, 10828-028) containing human ES cell medium as described 
previously”. Cells were subjected to mycoplasma testing at monthly intervals and 
short tandem repeats (STR) profiled to confirm cell identity at the initiation of the 
study. 

Neural crest induction. Human ES cells were plated on matrigel (BD Biosciences, 
354234)-coated dishes (10⁵ cells cm⁻²) in ES cell medium containing 10 nM FGF2
(R&D Systems, 233-FB-001MG/CF). Differentiation was initiated in KSR medium
(knockout DMEM plus 15% KSR (Life Technologies, 10828-028), L-glutamine 
(Life Technologies, 25030-081), NEAA (Life Technologies, 11140-050)) containing
LDN193189 (100 nM, Stemgent) and SB431542 (10 µM, Tocris). The KSR
medium was gradually replaced with increasing amounts of N2 medium from 
day 4 to day 10 as described previously’. For CNC induction, cells were treated 
with 3 µM CHIR99021 (Tocris Bioscience, 4423) in addition to LDN193189 and
SB431542 from day 2 to day 11. ENC differentiation involves additional treatment 
with retinoic acid (1 µM) from day 6 to day 11. For deriving MNCs, LDN193189
is replaced with BMP4 (10nM, R&D, 314-BP) and EDN3 (10nM, American 
Peptide company, 88-5-10B) from day 6 to day 11 (ref. 3). The differentiated cells 
are sorted for CD49D at day 11. CNS precursor control cells were generated by 
treatment with LDN193189 and SB431542 from day 0 to day 11 as previously
described’. Throughout the manuscript, day 0 is the day the medium is switched 
from human ES cell medium to LDN193189 and SB431542 containing medium. 
Days of differentiation in text and figures refer to the number of days since the 
pluripotent stage (day 0). 
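
The induction conditions above amount to a day-by-day schedule of factors for each lineage. As a rough sketch of that schedule, the Python snippet below encodes the windows given in the text and looks up which factors are present on a given day. Two points are assumptions made for illustration rather than statements from the protocol: that LDN193189 and SB431542 run over days 0-11 in the CNC and ENC conditions (the text gives this window explicitly only for the CNS control), and that in the MNC condition LDN193189 is withdrawn after day 5 while SB431542 and CHIR99021 continue. The names `Treatment`, `SCHEDULES` and `factors_on` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Treatment:
    factor: str
    start_day: int  # first day of exposure
    end_day: int    # last day of exposure (inclusive)


# Assumed base condition: dual SMAD inhibition across days 0-11.
DUAL_SMAD = [Treatment("LDN193189", 0, 11), Treatment("SB431542", 0, 11)]
# CNC: CHIR99021 added from day 2 to day 11.
CNC = DUAL_SMAD + [Treatment("CHIR99021", 2, 11)]

SCHEDULES: Dict[str, List[Treatment]] = {
    "CNS": DUAL_SMAD,
    "CNC": CNC,
    # ENC: CNC conditions plus retinoic acid from day 6 to day 11.
    "ENC": CNC + [Treatment("retinoic acid", 6, 11)],
    # MNC: LDN193189 replaced by BMP4 and EDN3 from day 6 (assumed hand-over).
    "MNC": [
        Treatment("LDN193189", 0, 5),
        Treatment("SB431542", 0, 11),
        Treatment("CHIR99021", 2, 11),
        Treatment("BMP4", 6, 11),
        Treatment("EDN3", 6, 11),
    ],
}


def factors_on(lineage: str, day: int) -> List[str]:
    """Return the factors present in the medium for a lineage on a given day."""
    return sorted(
        t.factor for t in SCHEDULES[lineage] if t.start_day <= day <= t.end_day
    )


for lineage in SCHEDULES:
    print(lineage, "day 7:", ", ".join(factors_on(lineage, 7)))
```

Writing the schedule as data makes the differences between lineages easy to compare: ENC and CNC differ only by the retinoic acid window, and MNC differs by the factor swap at day 6.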

FACS and immunofluorescence analysis. For immunofluorescence, the cells 
were fixed with 4% paraformaldehyde (Affymetrix-USB, 19943) for 20 min, then 
blocked and permeabilized using 1% bovine serum albumin (BSA) (Thermo 
Scientific, 23209) and 0.3% Triton X-100 (Sigma, T8787). The cells were then 
incubated in primary antibody solutions overnight at 4°C and stained with 
fluorophore-conjugated secondary antibodies at room temperature for 1h. The 
stained cells were then incubated with DAPI (1 ng ml⁻¹, Sigma, D9542-5MG)
and washed several times before imaging. For flow cytometry analysis, the 
cells are dissociated with Accutase (Innovative Cell Technologies, AT104) and 
fixed and permeabilized using BD Cytofix/Cytoperm (BD Bioscience, 554722) 
solution, then washed, blocked and permeabilized using BD Perm/Wash buffer 
(BD Bioscience, 554723) according to manufacturer’s instructions. The cells 
are then stained with primary (overnight at 4°C) and secondary (30 min at 
room temperature) antibodies and analysed using a flow cytometer (FlowJo
software). A list of primary antibodies and working dilutions is provided in 
Supplementary Table 4. The PHOX2A antibody was provided by J.-F. Brunet 
(rabbit, 1:800 dilution). 

In ovo transplants. Fertilized eggs (from Charles River Farms) were incubated at 
37°C for 50h before injections. A total of 2 x 10° CD49D-sorted, RFP-labelled NC 
cells were injected into the intersomitic space of the vagal region of the embryos 
targeting a region between somite 2 and 6 (HH 14 embryo, 20-25 somite stage). 
The embryos were collected 36 h later for whole-mount epifluorescence and 
histological analyses. 

Gene expression analysis. For RNA sequencing, total RNA was extracted using 
RNeasy RNA purification kit (Qiagen, 74106). For qRT-PCR assay, total RNA sam- 
ples were reverse transcribed to cDNA using Superscript II Reverse Transcriptase 
(Life Technologies, 18064-014). RT-PCR reactions were set up using QuantiTect 
SYBR Green PCR mix (Qiagen, 204148). Each data point represents three 
independent biological replicates. 
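
Fold changes from qRT-PCR data (as plotted in Fig. 1c) are commonly derived with the 2^(-ΔΔCt) method. The section does not say which quantification method or reference gene was used, so the following is only a sketch under that assumption, with perfect amplification efficiency also assumed; the function name and the Ct values in the example are hypothetical.

```python
def fold_change(ct_target_sample: float, ct_reference_sample: float,
                ct_target_control: float, ct_reference_control: float) -> float:
    """Relative expression by the 2^(-ddCt) method.

    Assumes roughly 100% PCR efficiency for both target and reference genes.
    """
    # Normalize the target gene to the reference gene within each condition.
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    # Compare the normalized values between conditions.
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: target gene in ENC (sample) versus CNC (control).
example_fc = fold_change(
    ct_target_sample=20.0, ct_reference_sample=15.0,
    ct_target_control=23.0, ct_reference_control=15.0,
)
print(f"fold change = {example_fc:.1f}")
```

Each biological replicate would yield one such fold change, and the mean and s.e.m. across the three replicates would then be reported.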

In vitro differentiation of ENC to enteric neurons. ENC cells from the 11-day 
induction protocol were aggregated into 3D spheroids (5 million cells per well) in 
Ultra Low Attachment 6-well culture plates (Fisher Scientific, 3471) and cultured 
in Neurobasal (NB) medium supplemented with L-glutamine (Gibco, 25030-164), 
N2 (Stem Cell Technologies, 07156) and B27 (Life Technologies, 17504044) containing
CHIR99021 (3 µM, Tocris Bioscience, 4423) and FGF2 (10 nM, R&D
Systems, 233-FB-001MG/CF). After 4 days of suspension culture, the spheroids 
are plated on poly-ornithine/laminin/fibronectin (PO/LM/EN)-coated dishes 
(prepared as described previously”®) in neurobasal (NB) medium supplemented 
with L-glutamine (Gibco, 25030-164), N2 (Stem Cell Technologies, 07156) and B27 
(Life Technologies, 17504044) containing GDNF (25 ng ml⁻¹, Peprotech, 450-10)
and ascorbic acid (100 µM, Sigma, A8960-5G). The ENC precursors migrate out
of the plated spheroids and differentiate into neurons in 1-2 weeks. The cells were 
fixed for immunostaining or collected for gene expression analysis at days 25, 40 
and 60 of differentiation. 


Induction of SMCs. Mesoderm specification was carried out in STEMPRO-34
(Gibco, 10639-011) medium. The ES cells were subjected to activin A treatment (100
ng ml⁻¹, R&D, 338-AC-010) for 24 h followed by BMP4 treatment (10 ng ml⁻¹,
R&D, 314-BP) for 4 days. The cells were then differentiated into SMC progenitors
by treatment with PDGF-BB (5 ng ml⁻¹, Peprotech, 100-14B), TGFβ3 (5 ng ml⁻¹,
R&D Systems, 243-B3-200) and 10% FBS. The SMC progenitors are expandable
in DMEM supplemented with 10% FBS.

ENC-SMC co-culture. The SMC progenitors were plated on PO/LM/FN-coated 
culture dishes (prepared as described previously”®) 3 days before addition of ENC- 
derived neurons. The neurons were dissociated (using Accutase, Innovative Cell
Technologies, AT104) at day 30 of differentiation and plated onto the SMC monolayer
cultures. The culture was maintained in neurobasal (NB) medium supplemented
with L-glutamine (Gibco, 25030-164), N2 (Stem Cell Technologies, 07156)
and B27 (Life Technologies, 17504044) containing GDNF (25 ng ml⁻¹, Peprotech,
450-10) and ascorbic acid (100 µM, Sigma, A8960-5G). Functional connectivity
was assessed at 8-16 weeks of co-culture. 

Pharmacological and optogenetic stimulations of co-cultured SMCs. SMC- 
only and SMC-ENC-derived neuron co-cultures were subjected to acetylcholine 
chloride (50 µM, Sigma, A6625), carbamoylcholine chloride (10 µM, Sigma, C4382)
and KCl (55 mM, Fisher Scientific, BP366-500) treatment, 3 months after initiating
the co-culture. Optogenetic stimulations were performed using a 450-nm pigtailed
diode pumped solid state laser (OEM Laser, PSU-III LED, OEM Laser Systems, Inc.)
achieving an illumination between 2 and 4 mW mm⁻². The pulse width was
4 ms and stimulation frequencies ranged from 2 to 10 Hz. For the quantification 
of movement, images were assembled into a stack using Metamorph software 
and regions with high contrast were identified (labelled yellow in Supplementary 
Fig. 5). The movement of five representative high-contrast regions per field was 
automatically traced (Metamorph software). Data are presented in kinetograms 
as movement in pixels in x and y direction (distance) with respect to the previous 
frame. 
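
The per-frame movement reported in the kinetograms can be sketched as follows. This is a minimal illustration, not the Metamorph implementation: the function name `frame_displacements` and the input layout (a list of (x, y) pixel positions, one per frame, for a single traced high-contrast region) are assumptions made for the example.

```python
def frame_displacements(track):
    """Return the (dx, dy) movement in pixels of one traced region
    with respect to the previous frame.

    track: list of (x, y) pixel positions, one entry per frame
           (hypothetical input layout).
    """
    return [
        (x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(track, track[1:])
    ]
```

A track of N frames yields N - 1 displacement pairs, which can then be plotted against frame number for the x and y directions separately.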

Generation of chimaeric tissue-engineered colon. We used the previously 
described method for generation of tissue-engineered colon’. In brief, the donor 
colon tissue was collected and digested into organoid units using dispase (Life 
Technologies, 17105-041) and collagenase type 1 (Worthington, CLS-1). The 
organoid units were then mixed immediately (without any in vitro culture) with 
CD49D-purified human ES-cell-derived ENC precursors (day 15 of differenti- 
ation) and seeded onto biodegradable polyglycolic acid scaffolds (2-mm sheet 
thickness, 60 mg cm⁻³ bulk density; porosity >95%, Concordia Fibres) shaped
into 2 mm long tubes with poly-L-lactide (PLLA) (Durect Corporation). The
seeded scaffolds were then placed onto and wrapped in the greater omentum of 
the adult (>2 months old) NSG mice. Just before the implantation, these mice were 
irradiated with 350 cGy. The seeded scaffolds were differentiated into colon-like 
structures inside the omentum for 4 additional weeks before they were surgically 
removed for tissue analysis. 

Transplantation of ENC precursors in adult colon. All mouse procedures were 
performed following NIH guidelines, and were approved by the local Institutional 
Animal Care and Use Committee (IACUC), the Institutional Biosafety Committee 
(IBC) as well as the Embryonic Stem Cell Research Committee (ESCRO). We used 
3-6-week-old male NSG (NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ) mice or 2-3-week-old
Ednrbs-l/s-l (SSL/LeJ) mice (n = 12, 6 male, 6 female) for these studies. Animal
numbers were based on availability of homozygous hosts and on sufficient statistical
power to detect large effects between treatment versus control (Ednrbs-l/s-l)
as well as for demonstrating robustness of migration behaviour (NSG). Animals
were randomly selected for the various treatment models (NSG and Ednrbs-l/s-l)
while ensuring an equal male/female ratio in each group (Ednrbs-l/s-l).
All in vivo experiments were performed in a blinded manner. Animals were
anaesthetized with isoflurane (1%) throughout the procedure, a small abdominal
incision was made, the abdominal wall musculature was lifted and the caecum was
exposed and exteriorized. Warm saline was used to keep the caecum moist. Then 20 µl of cell
suspension (2-4 million RFP⁺ CD49D-purified human ES-cell-derived ENC precursors)
in 70% Matrigel (BD Biosciences, 354234) in PBS or 20 µl of 70% Matrigel
in PBS only (control-grafted animals) was slowly injected into the caecum
(targeting the muscle layer) using a 27-gauge needle. Use of 70% Matrigel as carrier
for cell injection ensured that the cells stayed in place after the injection and prevented
backflow into the peritoneum. After injection the needle was withdrawn,
and a Q-tip was placed over the injection site for 30 s to prevent bleeding. The caecum
was returned to the abdominal cavity and the abdominal wall was closed using
4-0 vicryl and a taper needle in an interrupted suture pattern and the skin was closed
using sterile wound clips. After wound closure animals were put on paper on top
of their bedding and attended until conscious and preferably eating and drinking.
The tissue was collected at different time points (ranging from two weeks to four
months) after transplantation for histological analysis. Ednrbs-l/s-l mice were immunosuppressed
by daily injections of cyclosporine (10 mg kg⁻¹ i.p., Sigma, 30024).

Whole-mount fluorescence imaging and histology. The collected colon samples
were fixed in 4% paraformaldehyde at 4 °C overnight before imaging. Imaging
was performed using the Maestro fluorescence imaging system (Cambridge Research
and Instrumentation). The tissue samples were incubated in 30% sucrose (Fisher
Scientific, BP220-1) solutions at 4 °C for 2 days, and then embedded in OCT (Fisher
Scientific, NC9638938) and cryosectioned. The sections were then blocked with
1% BSA (Thermo Scientific, 23209) and permeabilized with 0.3% Triton X-100
(Sigma, T8787). The sections were then stained with primary antibody solution at
4 °C overnight and fluorophore-conjugated secondary antibody solutions at room
temperature for 30 min. The stained sections were then incubated with DAPI
(1 ng ml⁻¹, Sigma, D9542-5MG) and washed several times before they were
mounted with Vectashield Mounting Medium (Vector, H1200) and imaged using
fluorescent (Olympus IX70) or confocal microscopes (Zeiss SP5).

Total gastrointestinal transit time. Mice were gavaged with 0.3 ml of dye solution
containing 6% carmine (Sigma, C1022-5G), 0.5% methylcellulose (Sigma, 274429-5G)
and 0.9% NaCl, using a #24 round-tip feeding needle. The needle was held inside
the mouse oesophagus for a few seconds after gavage to prevent regurgitation.
After 1 h, the stool colour was monitored for gavaged mice every 10 min. For each
mouse, total gastrointestinal transit time is the interval between the time of gavage
and the time when red stool is observed.

Gene targeting. The double nickase CRISPR/Cas9 system was used to target the
EDNRB locus in EF1-RFP H9 human ES cells. Two guide RNAs were designed
(using the CRISPR design tool; http://crispr.mit.edu/) to target the coding sequence
with PAM targets ~20 base pairs apart (gRNA #1 target specific sequence:
5’-AAGTCTGTGCGGACGCGCCCTGG-3’, gRNA #2 target specific sequence:
5’-CCAGATCCGCGACAGGCCGCAGG-3’). The cells were transfected with
guide RNA constructs and GFP-fused Cas9-D10A nickase. The GFP-expressing
cells were FACS purified 24 h later and plated at low density (150 cells cm⁻²) on
mouse embryonic fibroblasts. The colonies were picked 7 days later and passaged
twice before genomic DNA isolation and screening. The targeted region of the EDNRB
gene was PCR amplified (forward primer: 5’-ACGCCTTCTGGAGCAGGTAG-3’,
reverse primer: 5’-GTCAGGCGGGAAGCCTCTCT-3’) and cloned into Zero
Blunt TOPO vector (Invitrogen, 450245). To ensure that both alleles (from each 
ES cell colony) are represented and sequenced, we picked 10 bacterial clones (for 
each ES cell clone) for plasmid purification and subsequent sequencing. The clones 
with bi-allelic nonsense mutations were expanded and differentiated for follow-up 
assays. 
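
As a quick sanity check on the two target sequences listed above, the sketch below splits each 23-nt sequence into a 20-nt protospacer and a 3-nt PAM and verifies that the PAM has the form NGG. The NGG requirement is an assumption (it is the usual PAM for Streptococcus pyogenes Cas9; the text does not name the Cas9 species), and the helper name `split_target` is hypothetical.

```python
import re

# Target-specific sequences as given in the text (5' to 3').
GUIDES = {
    "gRNA #1": "AAGTCTGTGCGGACGCGCCCTGG",
    "gRNA #2": "CCAGATCCGCGACAGGCCGCAGG",
}


def split_target(seq):
    """Split a 23-nt target into (protospacer, PAM).

    Raises ValueError unless the sequence is 20 nt followed by an
    NGG PAM (assumed SpCas9 convention).
    """
    seq = seq.upper()
    if not re.fullmatch(r"[ACGT]{20}[ACGT]GG", seq):
        raise ValueError(f"not a 20-nt protospacer + NGG PAM: {seq}")
    return seq[:20], seq[20:]


for name, seq in GUIDES.items():
    protospacer, pam = split_target(seq)
    print(name, protospacer, pam)
```

Both listed sequences pass this check, which is consistent with each being a 20-nt protospacer followed by its PAM.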

Migration assay. The ENC cells were plated on PO/LM/FN-coated (prepared as
described previously) 96-well or 48-well culture plates (30,000 cm⁻²). After 24 h,
the culture lawn was scratched manually using a pipette tip. The cells were given an
additional 24-48 h to migrate into the scratch area and fixed for imaging and
quantification. The quantification is based on the percentage of the nuclei that are
located in the scratch area after the migration period. The scratch area is defined
using a reference well that was fixed immediately after scratching. Migration
of cells was quantified using the open source data analysis software KNIME
(http://knime.org) with the ‘quantification in ROI’ plug-in as described in detail
elsewhere.
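
The migration readout (percentage of nuclei located in the scratch area) can be illustrated with a short sketch. It is not the KNIME workflow used in the study: it assumes, for simplicity, that the scratch is a vertical band bounded by two x coordinates taken from the reference well, and that nuclei are already reduced to centroid x positions. The function and parameter names are hypothetical.

```python
def percent_in_scratch(nuclei_x, scratch_left, scratch_right):
    """Percentage of nuclei whose centroid lies inside the scratch band.

    nuclei_x:      centroid x positions of all detected nuclei (pixels)
    scratch_left:  left edge of the scratch, from the reference well
    scratch_right: right edge of the scratch, from the reference well
    """
    if not nuclei_x:
        raise ValueError("no nuclei detected")
    inside = sum(scratch_left <= x <= scratch_right for x in nuclei_x)
    return 100.0 * inside / len(nuclei_x)
```

Because the score is normalized to the total number of nuclei in the field, it depends on both migration and cell number, which is one reason the scratch area is taken from a reference well fixed immediately after scratching.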

Proliferation assay. To quantify proliferation, FACS-purified ENC cells were
assayed using CyQUANT NF Cell Proliferation Assay Kit (Life Technologies,
C35006) according to the manufacturer’s instructions. In brief, to generate a standard,
cells were plated at various densities and stained using the fluorescent DNA
binding dye reagent. Total fluorescence intensity was then measured using a plate
reader (excitation at 485 nm and emission detection at 530 nm). After determining
the linear range, the CD49D⁺ wild-type and EDNRB⁻/⁻ ENC precursors were
plated (6,000 cells cm⁻²) and assayed at 0, 24, 48 and 72 h. The cells were cultured in
neurobasal (NB) medium supplemented with L-glutamine (Gibco, 25030-164), N2
(Stem Cell Technologies, 07156) and B27 (Life Technologies, 17504044) containing
CHIR99021 (3 µM, Tocris Bioscience, 4423) and FGF2 (10 nM, R&D Systems,
233-FB-001MG/CF) during the assay.
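
The standard-curve step can be sketched as an ordinary least-squares line fitted to fluorescence versus plated cell number within the linear range, then inverted to estimate cell number from a reading. This is an illustrative assumption about how such a standard is typically used, not the kit protocol; the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cells_from_fluorescence(signal, slope, intercept):
    """Invert the standard curve to estimate cell number."""
    return (signal - intercept) / slope
```

Readings outside the range of the standards should not be converted this way, since the fit is only meaningful inside the linear range.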

Viability assay. To monitor the viability of wild-type and EDNRB⁻/⁻ ENC precursors,
cells were assayed for lactate dehydrogenase (LDH) activity using CytoTox
96 cytotoxicity assay kit (Promega, G1780). In brief, the cells were plated in 96-well
plates at 30,000 cm⁻². The supernatant and the cell lysate were collected 24 h later and
assayed for LDH activity using a plate reader (490 nm absorbance). Viability was
calculated by dividing the LDH signal of the lysate by total LDH signal (from lysate
plus supernatant). The cells were cultured in neurobasal (NB) medium supplemented
with L-glutamine (Gibco, 25030-164), N2 (Stem Cell Technologies, 07156)
and B27 (Life Technologies, 17504044) containing CHIR99021 (3 µM, Tocris
Bioscience, 4423) and FGF2 (10 nM, R&D Systems, 233-FB-001MG/CF) during
the assay.
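
The viability calculation described above reduces to a single ratio, sketched here. The function name is hypothetical, and background subtraction is omitted because the text does not describe it.

```python
def viability_fraction(ldh_lysate, ldh_supernatant):
    """Viability = lysate LDH signal / (lysate + supernatant LDH signal)."""
    total = ldh_lysate + ldh_supernatant
    if total <= 0:
        raise ValueError("total LDH signal must be positive")
    return ldh_lysate / total
```

LDH retained in the lysate comes from cells that were intact at collection, whereas LDH in the supernatant was released by damaged cells, so a higher fraction indicates higher viability.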

High-throughput screening. The chemical compound screening was performed
using the Prestwick Chemical Library. The ENC cells were plated in
96-well plates (30,000 cm⁻²) and scratched manually 24 h before addition of the
compounds. The cells were treated with two concentrations of the compounds
(10 µM and 1 µM). The plates were fixed 24 h later for total plate imaging. The
compounds were scored based on their ability to promote filling of the scratch
in 24 h. The compounds that showed toxic effects (based on marked reduction
in cell numbers assessed by DAPI staining) were scored 0, compounds
with no effects were scored 1, compounds with moderate effects were scored
2, and compounds with strong effects (that resulted in complete filling of the
scratch area) were scored 3 and identified as hit compounds. The hits were
further validated to ensure reproducibility. The cells were treated with various
concentrations of the selected hit compound (pepstatin A) for dose response
analysis. The optimal dose (10 µM based on optimal response and viability)
was used for follow-up experiments. For the pre-treatment experiments, cells
were CD49D purified at day 11 and treated with pepstatin A from day 12 to day
15 followed by transplantation into the colon wall of NSG mice. The cells were
cultured in neurobasal (NB) medium supplemented with L-glutamine (Gibco,
25030-164), N2 (Stem Cell Technologies, 07156) and B27 (Life Technologies,
17504044) containing CHIR99021 (3 µM, Tocris Bioscience, 4423) and FGF2
(10 nM, R&D Systems, 233-FB-001MG/CF) during the assay.

BACE2 inhibition and knockdown. To inhibit BACE2, the ENC precursors
were treated with 1 µM β-secretase inhibitor IV (CAS 797035-11-1; Calbiochem).
To knock down BACE2, cells were dissociated using Accutase (Innovative Cell
Technologies, AT104) and reverse-transfected (using Lipofectamine RNAiMAX,
Life Technologies, 13778-150) with an siRNA pool (SMARTpool: ON-TARGETplus
BACE2 siRNA, Dharmacon, L-003802-00-0005) or four different individual siRNAs
(Dharmacon, LQ-003802-00-0002, 2 nmol). The knockdown was confirmed
by qRT-PCR measurement of BACE2 mRNA levels in cells transfected with the
BACE2 siRNAs versus the control siRNA pool (ON-TARGETplus Non-targeting
Pool, Dharmacon, D-001810-10-05). The transfected cells were scratched 24 h after
plating and fixed 48 h later for migration quantification. The cells were cultured in
neurobasal (NB) medium supplemented with L-glutamine (Gibco, 25030-164), N2
(Stem Cell Technologies, 07156) and B27 (Life Technologies, 17504044) containing
CHIR99021 (3 µM, Tocris Bioscience, 4423) and FGF2 (10 nM, R&D Systems,
233-FB-001MG/CF) during the assay.

Statistical analysis. Data are presented as mean ± s.e.m. and were derived from
at least three independent experiments. Data on replicates (n) are given in figure
legends. Statistical analysis was performed using the Student's t-test (comparing
two groups) or ANOVA with Dunnett test (comparing multiple groups
against control). Distribution of the raw data approximated normal distribution
(Kolmogorov-Smirnov normality test) for data with sufficient number of replicates
to test for normality. Survival analysis was performed using a log-rank
(Mantel-Cox) test. Z-scores for primary hits were calculated as Z = (x − μ)/σ,
in which x is the migration score value and is 3 for all hit compounds; μ is the
mean migration score value, and σ is the standard deviation for all compounds
and DMSO controls (n = 224).
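
The Z-score for a hit compound can be sketched as below. The text does not state whether the population or the sample standard deviation was used; the population form (`statistics.pstdev`) is an assumption here, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def z_score(x, all_scores):
    """Z = (x - mu) / sigma over all compound and DMSO control scores.

    x:          migration score of the compound of interest (3 for hits)
    all_scores: migration scores of all compounds and DMSO controls
    """
    mu = mean(all_scores)
    sigma = pstdev(all_scores)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("scores have zero variance")
    return (x - mu) / sigma
```

Because every hit has x = 3, all hits on a plate set share the same Z-score; the value mainly reflects how rare a score of 3 is among the 224 wells.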

Extended Data Figure 1 | Characterization of ES-cell-derived NC
populations. a, Schematic illustration of CNC and MNC induction
protocols. b, Flow cytometry for CD49D and SOX10-GFP in CNC
and MNC cells. c, Immunofluorescence of unsorted and CD49D-sorted
differentiated NC cells for SOX10. d, Flow cytometry for CD49D in ENCs
derived from H9 human ES cells and control and familial dysautonomia
(FD) human iPS cells. e, f, Representative immunofluorescence images
and flow cytometry in ES-cell-derived ENC for enteric precursor lineage
marker at day 11. g, List of the top 10 and selected additional (bold) most
differentially expressed transcripts from the RNA-seq analysis of CNC
compared to stage-matched CNS precursors. h, Lists of the top 10
and selected additional (bold) most differentially expressed transcripts
from the RNA-seq analysis of MNC compared to stage-matched CNS
precursors. i, Distribution of CNC and MNC cells in developing
chick embryos at 24-36 h after injection. Right, higher power image of
the clusters of MNC cells in the developing surface ectoderm. NotoC,
notochord; NT, neural tube; S, somite. Scale bars, 50 µm (c, i, middle),
25 µm (e, i, right) and 1 mm (i, left).

Extended Data Figure 2 | Characterization of human ES-cell-derived
enteric neural lineages. a, Flow cytometry for TRKC and PHOX2A
course qRT-PCR analysis of enteric lineage markers during more extended
in vitro differentiation periods; n = 3 independent experiments.
d, e, Flow cytometric quantification of enteric neuron precursor
neurons were detected under cranial conditions despite the presence of
many TUJ1⁺ neurons and increased percentages of tyrosine-hydroxylase-positive
cells. d, e, Flow cytometry for TRKB and TRKC under CNC and
ENC conditions; n = 3 independent experiments. Scale bars, 50 µm.

Extended Data Figure 4 | Functional characterization of ES-cell-derived
enteric neurons in co-culture with SMCs. a, Schematic illustration of
SMC differentiation protocol. b, Immunofluorescence staining of SMC
progenitors for SMA and ISL1. c, Association of various ENC-derived
neuron subtypes with SMA⁺ cells. d, SYN-eGFP expression in ENC-derived
neurons at 40 days of co-culture with SMCs and stage-matched
neurons in the absence of SMCs. e, Monoculture of SMCs versus
co-cultures of SMCs with ENC-derived neurons. Top, phase-contrast
images showing morphological changes of SMCs in co-culture. Bottom,
immunofluorescence staining of mature SMC markers MYH11 and AchR
in monoculture of SMCs versus co-cultures of SMCs with ENC-derived
neurons. f, Diagrams representing extent of contraction of SMC cultures.
Arrows indicate the time of pharmacological stimulation. Scale bars,
50 µm (b, c, e) and 100 µm (d).


© 2016 Macmillan Publishers Limited. All rights reserved 


[Panel a schematic labels: intestinal organoid units; CD49D-purified ENC precursors (day 15); seeded scaffold on omentum; tissue-engineered intestine]

Extended Data Figure 5 | Generation of tissue-engineered colon using 
human ES-cell-derived ENC precursors. a, Schematic illustration 
synaptophysin (hSyn). Dotted line shows approximate location of border
between muscle and epithelial/submucosal-like layers. H&E, haematoxylin
and eosin. Scale bars, 20 µm (b, left and middle) and 40 µm (b, right).

[Panel labels: SC121/TUJ1/DAPI; TUJ1/hSYN/DAPI; Matrigel injected; SC123/DAPI; SC121/5HT/DAPI; SC121/NOS/DAPI]

Extended Data Figure 6 | Characterization of transplanted human
ES-cell-derived ENC precursors in adult colon of NSG and Ednrbs-l/s-l
mice. a, Whole-mount microscopy of colon transplanted with RFP⁺
CD49D-purified ES-cell-derived ENC precursors to track RFP expression
at injection site, at 1 h after transplantation to ensure that cells were
injected at proper location (left), and at 2 weeks to show dispersal of
the cells and congregation of subset of cells into distinct clusters (right).
The dashed lines indicate the outer border of the intact colon tissue.
b, c, Whole-mount fluorescence imaging and quantification of migration
of grafted RFP⁺ ES-cell-derived CNS precursors and CD49D-purified
CNC precursors inside the adult colon wall. d, Megacolon-like phenotype
in control animals versus animals receiving ES-cell-derived ENC
transplants. e, f, Whole-mount fluorescence imaging and quantification
of migration of grafted RFP⁺ CD49D-purified ES-cell-derived ENC
precursors in colon of Ednrbs-l/s-l mice. g, Total gastrointestinal transit
times by carmine dye gavage of Ednrbs-l/s-l mice grafted with RFP⁺
CD49D-purified ES-cell-derived ENC precursors versus Matrigel-only
grafted mice; n = 3 for grafted animals, n = 2 for Matrigel group. Note
that n = 2 for Matrigel group was due to the fact that nearly all Matrigel-injected
animals died owing to their disease phenotype. h, Representative
images of grafted ES-cell-derived ENC precursors at 3 months after
transplantation into the colon of Ednrbs-l/s-l mice co-expressing TUJ1
and SC121. i, Immunofluorescence staining of cross sections of Ednrbs-l/s-l
colons transplanted with Matrigel for SC121 and human-specific
synaptophysin. j, Representative image of grafted ES-cell-derived ENC
precursors at 3 months after transplantation into the colon of Ednrbs-l/s-l
mice expressing human-specific GFAP (SC123). k, l, Representative
images of grafted ES-cell-derived ENC precursors at 6 weeks after
transplantation into the colon of NSG (wild type) and Ednrbs-l/s-l mice.
The dashed lines indicate the border between submucosal and muscle
layers. Scale bars, 200 µm (a), 1 cm (b, e) and 100 µm (h-l). AU, arbitrary
units. P value for g is given numerically, t-test with Welch's correction;
n = 3 independent experiments.

Extended Data Figure 7 | Establishing and characterizing EDNRB-null
human ES cell lines. a, Sequences of wild-type and Cas9-nickase induced
bi-allelic nonsense mutations in targeted region of Ednrb⁻/⁻ clones.
b, Western blot analysis for EDNRB in ES-cell-derived ENC precursors
showing lack of protein expression in the mutant clones C1-C4.
c, EDNRB⁻/⁻ human ES cells can be efficiently differentiated into CD49D⁺
human ES-cell-derived ENC based on CD49D expression (c) and
expression of SOX10 (data not shown). d, Proliferation of EDNRB⁻/⁻
human ES-cell-derived ENCs (day 11) versus wild-type-derived cells; n = 4
independent experiments. e, LDH activity measurement of cell viability in
EDNRB⁻/⁻ ES-cell-derived ENCs (day 11) versus wild-type-derived cells.
*P < 0.05 (t-test; n = 3 independent experiments).

Extended Data Figure 8 | Chemical screening for compounds that
rescue migration of EDNRB⁻/⁻ ES-cell-derived NC precursors.
a, Schematic illustration of the timeline and experimental steps involved
in the chemical screening assay and migration scoring system. b, Example
of a screening plate layout and locations of dimethylsulfoxide (DMSO)
control wells. c, Migration scores of Prestwick library compounds and
DMSO controls. d, Migration assay and scores for EDNRB⁻/⁻ ES-cell-derived
ENC precursors treated with primary hit compounds. Z-score for
primary hit compounds in c is given numerically (compared to DMSO
control; n = 224 technical replicates).

Extended Data Figure 9 | Pharmacological modulation of migration in
human ES-cell-derived ENC precursors. a, BACE2 expression in the
various human ES-cell-derived NC sublineages at day 11 as compared
CD49D-purified wild-type and Ednrb⁻/⁻ ES-cell-derived ENC precursors,
with or without pepstatin A pre-treatment (compare to Fig. 4j).
d, Representative images of wild-type CD49D-purified ES-cell-derived

Mutant Kras copy number defines metabolic
reprogramming and therapeutic susceptibilities


Emma M. Kerr¹, Edoardo Gaude¹, Frances K. Turrell¹, Christian Frezza¹ & Carla P. Martins¹


The RAS/MAPK (mitogen-activated protein kinase) signalling
pathway is frequently deregulated in non-small-cell lung cancer,
often through KRAS activating mutations. A single endogenous
mutant Kras allele is sufficient to promote lung tumour formation
in mice but malignant progression requires additional genetic
alterations. We recently showed that advanced lung tumours
from KrasG12D/+;p53-null mice frequently exhibit KrasG12D allelic
enrichment (KrasG12D/Kraswild-type > 1) (ref. 7), implying that
mutant Kras copy gains are positively selected during progression.
Here we show, through a comprehensive analysis of mutant Kras
homozygous and heterozygous mouse embryonic fibroblasts and
lung cancer cells, that these genotypes are phenotypically distinct.
In particular, KrasG12D/G12D cells exhibit a glycolytic switch coupled
to increased channelling of glucose-derived metabolites into the
tricarboxylic acid cycle and glutathione biosynthesis, resulting in
enhanced glutathione-mediated detoxification. This metabolic
rewiring is recapitulated in mutant KRAS homozygous non-small-cell
lung cancer cells and in vivo, in spontaneous advanced
murine lung tumours (which display a high frequency of KrasG12D
copy gain), but not in the corresponding early tumours (KrasG12D
heterozygous). Finally, we demonstrate that mutant Kras copy
gain creates unique metabolic dependences that can be exploited
to selectively target these aggressive mutant Kras tumours. Our
data demonstrate that mutant Kras lung tumours are not a single
disease but rather a heterogeneous group comprising two classes of
tumours with distinct metabolic profiles, prognosis and therapeutic
susceptibility, which can be discriminated on the basis of their
relative mutant allelic content. We also provide the first, to our
knowledge, in vivo evidence of metabolic rewiring during lung
cancer malignant progression.

The Ras pathway is frequently upregulated during the malignant
progression of mutant Kras tumours, indicating that this transition
requires further increased Ras activity. But how this activity may contribute
to malignant progression remains unclear. We recently identified
mutant Kras (Krasmut) copy gains in high-grade murine lung
tumours⁷ and mutant-specific gains have also been reported in non-small-cell
lung cancer (NSCLC). We thus hypothesized that the gain
of a second Krasmut copy affords additional oncogenic phenotypes to
Kras heterozygous cells. To identify such potential gain-of-function
phenotypes, we compared the acute impact of KrasG12D-endogenous
allele activation in heterozygous and homozygous KrasG12D mouse
embryonic fibroblasts (MEFs). MEFs were generated on a p53-null
background (Extended Data Fig. 1a) to recapitulate the tumour genotype
in which KrasG12D copy gains were identified⁷ but, for simplicity,
hereafter they will be termed Kraswild-type/wild-type (WT/WT), KrasG12D/WT
and KrasG12D/G12D.

As reported¹¹, KrasG12D/WT cells showed a proliferative advantage relative
to KrasWT/WT MEFs. Surprisingly, KrasG12D/WT and KrasG12D/G12D
cells grew similarly at early passages (P1-P5) (Fig. 1a, b), indicating
that proliferation is not directly affected by Krasmut copy gain. A growth
advantage of KrasG12D/G12D cells was nevertheless observed after P6.
To identify both immediate and proliferation-independent KrasG12D
copy-gain-dependent effects, subsequent analyses were restricted to
early passages. While KRAS amplifications are typically associated
with increased expression, KrasG12D/WT and KrasG12D/G12D Ras
protein levels were comparable and only slightly increased relative
to KrasWT/WT MEFs. Nevertheless, KrasG12D/G12D MEFs exhibited a
~2-fold increase in activated Ras relative to KrasG12D/WT cells (Fig. 1c),
indicating that mutant copy gain may have functional implications.
In agreement, microarray analysis identified 1,666 genes differentially
regulated (>1.3-fold) between KrasG12D/G12D and KrasG12D/WT MEFs,
with glycolysis being the most significantly altered pathway (Fig. 1d
and Extended Data Fig. 1b).

Mutant Kras activity enhances glucose uptake and rewires glucose
metabolism into the hexosamine biosynthesis and pentose phosphate
pathways in pancreatic ductal adenocarcinoma. However, its metabolic
impact on other cancer types and, more importantly, that of
Krasmut copy gain is unclear. KrasG12D/WT and KrasWT/WT MEFs showed
similar glycolytic gene expression profiles with the exception of Slc2a1
(Glut1) and Slc2a3 (Glut3, data not shown). In contrast, in KrasG12D/G12D
MEFs, glycolytic gene expression was significantly upregulated
and mirrored by increased glucose uptake, lactate secretion and
glycolytic capacity (Fig. 1e, f and Extended Data Fig. 1c, d). Thus, we
show that Krasmut copy gain induces a glycolytic switch while a Kras
mutation per se is not sufficient to upregulate glycolysis. Notably, analysis
of murine lung tumour cell lines with distinct Kras G12D/WT allelic
content revealed a direct correlation between increased KrasG12D copy
number (KrasG12D/total Kras) and enhanced GTPase activity as well
as glycolysis (extracellular acidification rate, ECAR), consistent with
a ‘Krasmut-dosage’ effect. Glycolytic gene expression, glucose uptake
and lactate secretion were also significantly enhanced in KrasG12D/G12D
relative to heterozygous lung tumour cells (Fig. 1g and Extended Data
Fig. 1e-g). Importantly, KRASmut homozygosity is highly prevalent
(48.6%) within mutant KRAS NSCLC cell lines (COSMIC), emphasizing
its relevance and enabling the validation of our findings in a
clinically relevant NSCLC model. Reassuringly, the distinct glycolytic
phenotypes of KRASmut heterozygous and homozygous cells were
confirmed in NSCLC cells (Fig. 1h and Extended Data Table 1), demonstrating
that glycolysis upregulation is a Krasmut copy-gain-associated
gain-of-function.
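
The two ways of expressing relative mutant allelic content used in this Letter, the mutant fraction (KrasG12D/total Kras) and the mutant-to-wild-type ratio (KrasG12D/Kraswild-type > 1 for allelic enrichment), are related by simple arithmetic. The sketch below is only an illustration of that relationship; the function names are hypothetical and the inputs are not data from the study.

```python
def mutant_to_wildtype_ratio(mutant_fraction):
    """Convert mutant/total allelic fraction into a mutant/wild-type ratio."""
    if not 0.0 <= mutant_fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if mutant_fraction == 1.0:
        return float("inf")  # no wild-type allele detected
    return mutant_fraction / (1.0 - mutant_fraction)


def has_allelic_enrichment(mutant_fraction):
    """True when mutant/wild-type > 1, i.e. mutant fraction above 0.5."""
    return mutant_to_wildtype_ratio(mutant_fraction) > 1.0
```

A balanced heterozygous sample has a mutant fraction of 0.5 and a ratio of exactly 1, so any ratio above 1 indicates that mutant alleles outnumber wild-type alleles.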

Enhanced glycolysis is a well-recognized cancer phenotype typically
associated with increased growth demands, and/or compensatory
adaptation to mitochondrial defects. Yet, early passage KrasG12D/G12D
MEFs displayed a glycolytic switch relative to KrasG12D/WT cells despite
exhibiting comparable proliferative rates, cell volume, diameter, protein
and RNA content (Fig. 1a, b and Extended Data Fig. 2a-d). Furthermore,
mitochondrial morphology and function were similar across genotypes
(Extended Data Fig. 2e-h), despite a KrasG12D-associated decrease in
membrane potential, as reported under overexpression conditions.
We then hypothesized that this glycolytic switch reflected alternative
glucose utilization by KrasG12D/G12D cells. Liquid chromatography-mass
spectrometry (LC-MS)-based metabolomics analysis confirmed the

Figure 1 | Mutant Kras copy gain upregulates glycolysis in MEFs and lung
tumour cells. a, Proliferative rate of KrasWT/WT (WT/WT), KrasG12D/WT
(G12D/WT) and KrasG12D/G12D (G12D/G12D);p53-null MEFs. p53 is also
known as Trp53 in mice (TP53 in humans). b, Fluorescence-activated
cell sorting (FACS) analysis denoting BrdU+ MEFs. c, MEF Ras levels
(immunoblotting) and activation (Raf-GST pull-down, normalized to
WT/WT). d, Heatmap illustrating differential gene expression between
G12D/WT and G12D/G12D MEFs (n = 3 per genotype, microarray); top
canonical pathways altered shown (Ingenuity Pathway Analysis, IPA).
e, Glycolytic gene expression (MEF microarray-based heatmap). Genes
significantly upregulated in G12D/G12D cells highlighted (bold red, t-test).
f, MEF glucose consumption and lactate secretion. g, Left: KrasG12D/KrasWT
allelic frequency (pyrosequencing) versus Ras activation or glycolysis (ECAR)
in KrasG12D/WT;p53-deficient murine lung tumour cells (n = 6) (Pearson's
correlation). Right: glucose consumption and lactate secretion in G12D/WT
and G12D/G12D cell line pair (t-test). h, Ras activation (normalized to
H358), glucose consumption and lactate secretion in KRASmut heterozygous
(HET: H23, H358) or homozygous (HOM: H460, SW1573) NSCLC cell
lines. c, f, h, One-way analysis of variance (ANOVA). a-c, Representative
data of three independent MEFs per genotype; d-f, n = 3 per genotype.
g, h, Histograms: representative data (n = 3 independent experiments). All graphs
depict triplicate mean ± s.d. (error bars). ***P < 0.001; **P < 0.01; *P < 0.05.


enhanced glycolytic phenotype of KrasG12D/G12D cells and, unexpectedly, 
uncovered a significant increase in glucose-derived tricarboxylic acid 
(TCA) cycle metabolites in Krasmut homozygous MEFs and (murine 
and human) lung tumour cells (Fig. 2 and Extended Data Figs 3, 4), 
confirming their intact mitochondrial function. More importantly, 
these data identified a Krasmut copy-gain-specific metabolic rewiring 
and uncovered a (TCA-coupled) glucose metabolism signature not pre- 
viously associated with mutant Kras activity. 

Despite their differential glucose utilization, KrasG12D/WT and 
KrasG12D/G12D MEFs had similar oxidative phosphorylation levels 
(Extended Data Fig. 2e), hinting at additional TCA cycle differ- 
ences. Since Krasmut cells were reported to preferentially utilize glu- 
tamine, rather than glucose, to fuel the TCA cycle’”'’, glutamine 
metabolism was assessed. KrasG12D/WT cells had the highest levels of 
relative to KrasWT/WT MEFs (Extended Data Fig. 5a-j). However, unlike 
the genotype-specific glucose metabolism signatures, differential glu- 
tamine utilization could not be consistently recapitulated in tumour 
cells (data not shown), possibly reflecting their proliferative and oxygen 
consumption rate heterogeneity. Thus, KrasG12D/WT-specific glutamine 
metabolism rewiring is either MEF-specific or masked by other muta- 
tions in cancer cells. 

Enhanced pyruvate dehydrogenase (PDH) activity!’ could explain 
the glucose metabolism reprogramming exhibited by homozygous 
cells. However, since KrasG12D/G12D and KrasG12D/WT MEFs showed 
comparable Pdhe1α expression and PDH activity (Fig. 3a), we spec- 
ulated that, instead, genotype-specific metabolic requirements 
drive this metabolic switch. Our metabolomics data showed that 
glutathione (GSH) and its precursors serine, glycine and gluta- 
mate were strikingly enriched with glucose-derived carbons in 
Krasmut homozygous cells (Fig. 2 and Extended Data Figs 3c-e, i 
and 4d-f, h, l, m), implying that Krasmut copy gain rewired glu- 
cose metabolism towards glutathione biosynthesis. Glutamine was 
also more efficiently metabolized towards GSH biosynthesis in 
KrasG12D/G12D MEFs (Extended Data Fig. 5g, k). We therefore assessed 
the impact of KrasG12D-gain on redox management. Consistent with 
previous reports!®°, KrasG12D/WT MEFs showed decreased ROS lev- 
els and an increased NADPH/NADP+ ratio relative to KrasWT/WT. 
Nevertheless, KrasG12D/G12D MEFs exhibited a more striking anti- 
oxidant signature, marked by significantly increased NADPH 
and GSH synthesis, NADPH/NADP+ and GSH/GSSG ratios and, 
conversely, lower ROS levels and increased resistance to ROS- 
inducing agents (H2O2) (Fig. 3b-e and Extended Data Fig. 6a). 
Mutant Kras was previously associated with enhanced expres- 
sion and activity of the antioxidant programme regulator Nrf2 
(ref. 21) in Kras@!??/7;9537/* MEFs”, potentially explaining the 
increased redox potential of our Kras©!7"’!2).53~/~ MEFs. Despite 
exhibiting comparable Nrf2 expression to heterozygous (Fig. 3f), 
KrasG12D/G12D MEFs showed upregulation of Nrf2-regulated GSH 
utilization genes”, indicating that increased Nrf2-mediated detoxi- 
fication may contribute to their metabolic rewiring. 
Figure 3 | Mutant Kras copy-number dictates redox state, metabolic 
dependencies and therapeutic susceptibilities. a, Total and phosphorylated 
Pdhe1α levels and PDH activity in MEFs. b, Cellular ROS (CellRox); 
c, NADPH/NADP+ ratio and NADPH levels; d, GSH/GSSG ratio and 
GSH levels in MEFs. e, MEF survival upon 24 h H2O2 treatment. f, Nrf2 
and Nrf2-target gene expression in MEFs (left: quantitative PCR (qPCR); 
right: microarray). Nrf2 targets significantly upregulated in homozygous 
MEFs highlighted (bold red, t-test). g, MEF viability after 72 h culture 
in low glucose (GLC), 2DG or low glutamine (GLN), relative to normal 
media (control). h, Percentage of AnnexinV+/propidium iodide+ (AnV/ 
PI-positive, FACS) MEFs upon 48 h BSO, 2DG, or combined (2DG + BSO) 
treatment. i, GSH/GSSG ratio and GSH levels in KRASmut NSCLC cells. 
j, NSCLC cells treated as in h. Triplicate mean ± s.d. shown for three 
independent MEFs per genotype (a, c, d, f) or for representative data (three 
independent runs) (b, e, g-j). Data normalized to WT/WT (a, c-f) or HET 
mean (i). One-way (a-f, i) or two-way ANOVA (g, h, j). ***P < 0.001; 
**P < 0.01; *P < 0.05; NS, not significant. 


The metabolic heterogeneity of Krasmut cells can potentially limit the 
efficacy of generalized targeting approaches”, prompting us to explore 
potential Krasmut copy number-dependent susceptibilities. Unlike 
heterozygotes, KrasG12D/G12D MEFs were very sensitive to low glucose 
levels and the glucose analogue 2-deoxy-D-glucose (2DG), which 
induced a notable apoptotic response (Fig. 3g, h). In turn, KrasG12D/WT 
MEFs showed higher sensitivity to low glutamine. Confirming a reli- 
ance on glucose for efficient ROS management, KrasG12D/G12D cells (but 
not KrasWT/WT or KrasG12D/WT) showed increased ROS levels upon 
2DG treatment, while N-acetyl-L-cysteine (NAC, GSH precursor) 
partly rescued their 2DG-induced apoptosis (Extended Data Fig. 6b, c). 
Moreover, combined 2DG and L-buthionine-S,R-sulfoximine (BSO, 
GSH biosynthesis inhibitor”) treatment induced drastic, KrasG12D/G12D- 
specific apoptosis and a reduction in GSH/GSSG ratio (Fig. 3h and 
Extended Data Fig. 6d). Likewise, murine and human Krasmut homozy- 
gous lung cancer cells exhibited increased GSH levels, GSH/GSSG 
ratio and enhanced sensitivity to low glucose, 2DG and 2DG/BSO, 
relative to heterozygotes (Fig. 3i, j and Extended Data Fig. 6e-g), 
and metabolic rewiring in vivo. a, Representative haematoxylin and 
eosin (H&E) sections (scale bar, 20 μm) and KrasG12D allelic frequency 
(pyrosequencing) in independent KrasG12D/+;p53Fx/Fx early and late lung 
tumours (n = 4 mice per cohort) and normal lung (‘early’, ‘late’, ‘normal’, 
respectively). b, Relative abundance of selected 13C-glucose-derived 
metabolites (LC-MS) in samples from a (n = 3 normal, n = 16 early, n = 12 
late). c, Representative imaging and luciferase activity/mouse 3 weeks after 
MEF transplantation (n = 8 per genotype). d, Representative H&E and 
quantification of lung tumours in MEF recipients (t-test). Arrows, lung 
tumours; scale bars, 2 mm (large), 250 μm (small). e, Left: luciferase imaging 
of lung cancer cell (L1212, L1211) recipients, 3 weeks after transplantation 
(n = 5 per genotype, left). Right: recipient survival (Kaplan-Meier 
log-rank test, n = 9 per genotype). f, Ki67+ quantification of early and late 
tumours treated for 48 h with 2DG + BSO or vehicle (CTRL) (n = 3 mice 
per cohort). g, KRASmut TCGA lung adenocarcinoma’ analysis following 
tumour segregation into ‘KRASmut’ (mutation (mut)) and ‘KRASmut&CG’ 
(mutation and copy gain (mut&CG)) cohorts. KRAS copy number/tumour 
shown (upper left). Differential expression of glycolysis and glutathione 
pathway genes illustrated (RNaseq, IPA; bottom left). Glycolytic genes 
significantly upregulated in KRASmut&CG tumours (bold red, RNaseq) or 
G12D/G12D MEFs (boxes, microarray) relative to heterozygous illustrated 
(right). Mean ± s.e.m. (a, f) or ± s.d. (d) shown. a, e, One-way ANOVA. 
***P < 0.001; **P < 0.01; *P < 0.05; NS, not significant. 


revealing a mutant Kras copy-gain-specific susceptibility to glucose 
and glutathione depletion in lung cancer cells. 

Finally, we defined the metabolic impact of Kras copy gain 
in vivo using the spontaneous Kras©!)!*;p53-/~ lung tumours where 
these gains were originally reported’. These tumours progress over 
time from low- to high-grade, with KrasG12D gains being associ- 
ated with the latter, prompting us to compare glucose flux in early 
(mostly low-grade) and late (typically advanced) tumours. Control 
or tumour-bearing mice were infused with 13C-glucose and normal 
lung or individual tumours isolated for LC-MS analysis and biopsied 
for Kras locus assessment. Early tumours and control lung showed 
similar KrasG12D allelic content (means 46.7% and 46.2% respectively; 
Fig. 4a), demonstrating that early lesions are predominantly 
heterozygous. In contrast, late tumours showed increased KrasG12D 
allelic prevalence (mean > 50%), confirming mutant enrichment 
in advanced disease’. Importantly, and consistent with our in vitro 
data, late tumours exhibited an increase in glucose-derived TCA cycle 
metabolites, as well as serine, glycine and GSH (Fig. 4b and Extended 
Data Fig. 7). Notably, by showing that early and late tumours have 
distinct metabolic profiles, we provide, to our knowledge, the first 
evidence of in vivo metabolic reprogramming during lung cancer 
malignant progression. 

KrasG12D copy gain also drove increased malignancy in MEFs and 
lung cancer cells (Fig. 4c-e). Accordingly, KrasG12D/G12D MEFs showed 
a highly penetrant colonization phenotype inducing lung tumours in 
eight out of eight recipient mice following intravenous transplanta- 
tion. In contrast, none of the KrasWT/WT (n = 5, data not shown) and 
only one out of eight KrasG12D/WT recipients developed a lung lesion. 
Similarly, KrasG12D/G12D lung cancer cells exhibited a significantly 
increased metastatic potential relative to heterozygous cells, establish- 
ing a direct link between Krasmut gains and lung cancer malignancy. 

Lastly, we defined the therapeutic impact of glucose metabo- 
lism rewiring in vivo by treating early and late lung tumours with 
2DG + BSO. Similarly to MEFs and lung cancer cells, late tumours 
(where KrasG12D allele prevalence is increased (Fig. 4a)) were signif- 
icantly more sensitive than early tumours to 2DG + BSO treatment 
(Fig. 4f). Thus, despite the presence of a KrasG12D allele (and p53 inac- 
tivation) in both groups and their comparable proliferation (Fig. 4f, 
CTRL), low and high-grade lung tumours have distinct and mutant 
Kras copy number-dependent therapeutic susceptibilities. 

Kras allelic imbalance’ and mutant Kras upregulation? were shown 
to select for p53 inactivation and correlate with tumour progression, 
but the oncogenic effects of enhanced mutant Kras signalling remained 
unclear. Here we show that even in the absence of p53, KrasG12D/WT and 
KrasG12D/G12D cells are phenotypically distinct, with mutant Kras copy 
gain driving gain-of-functions that include upregulation and repro- 
gramming of glucose metabolism, enhanced ROS management and 
increased metastatic potential. It is possible that loss of the KrasWT 
allele contributes to the phenotypes observed in KrasG12D/G12D cells?45, 
However, since the majority of advanced murine lung tumours retain 
the wild-type allele (Fig. 4a), we argue that mutant Kras gains are the 
more likely target of positive selection during lung cancer progression. 
In agreement, KRASWT loss is uncommon whereas mutant and wild- 
type KRAS gains are frequent features of mutant KRAS human lung 
adenocarcinoma”!”*, 

Importantly and consistent with our findings, TCGA copy number 
variation and RNaseq data analysis of 65 mutant KRAS lung adeno- 
carcinomas’ revealed that, despite likely allelic gain heterogeneity’”, 
combined KRAS mutation and copy gain correlate with glycolysis and 
glutathione metabolism pathway upregulation (KRASmut&CG, Fig. 4g 
and Extended Data Table 2). These data confirm that mutant KRAS 
lung tumours are not a single metabolic entity and that, similarly to 
murine tumours, they may comprise (at least) two disease subgroups 
with distinct genetic and metabolic signatures and unique therapeutic 
susceptibilities. We argue that this heterogeneity may have contributed 
to the poor treatment responses of KRAS mutant tumours and hence 
that combined quantitative and qualitative KRAS locus assessment may 
have both prognostic and therapeutic utility. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Mice, adenoviral infection and treatments. Animals were maintained 
under SPF conditions and in compliance with UK Home Office regulations. 
KrasLSL-G12D (ref. 6) mice were crossbred to p53Fx (ref. 12) to obtain mixed back- 
ground (C57Bl/6/129/Sv) KrasLSL-G12D/+;p53Fx/Fx (for spontaneous tumours and 
MEF generation) or Kras+/+;p53+/+ (transplantation recipients) mice. Endogenous 
lung tumours were generated through intranasal administration of 8- to 10-week- 
old KrasLSL-G12D/+;p53Fx/Fx mice (termed KrasG12D/WT;p53Fx/Fx) with Cre-expressing 
adenovirus (5 x 10’ plaque-forming units per mouse, University of Iowa Vector 
Core), as previously described’. For therapeutic studies, tumour-bearing mice were 
treated 12 (early group) or 16 weeks (late group) after Cre administration with a 
combination of 1,000 mg kg−1 2DG and 10 mmol kg−1 BSO or vehicle (saline) once 
a day for 2 days (intraperitoneal). Lungs were collected 24 h after the last treatment 
and formalin fixed (of note: 2DG + BSO treatment was sometimes associated with 
a temporary decrease in motility in both control and tumour-bearing mice). 

For transplantation studies, 8- to 12-week-old syngeneic wild-type mice were 
sublethally irradiated (4 Gy, caesium source) 6h before tail vein injection with 
1 x 10° cells in 100 μl PBS. Baseline luminescence values were collected 24 h after 
transplantation and tumour growth monitored weekly by bioluminescence imag- 
ing after intraperitoneal injection with D-luciferin (150 mg/kg body weight) using 
an IVIS Spectrum Xenogen machine (Caliper Life Sciences). Relative luciferase 
activity corresponds to change from baseline at indicated time point, normalized 
to blank control (luciferase-negative animal). For tumour load analysis, lungs were 
collected 3 weeks after transplantation, whereas tumour survival represents the 
onset of moderate signs of disease. Two independent MEFs per genotype were 
used in transplantation studies (four recipients per MEF line). Lung cancer cell 
lines (L1211 and L1212) were each transplanted onto five (tumour load study) or 
nine recipient mice (survival). All studies involved animals of both sexes and no 
animals were excluded from the analysis. Cohort sizes were calculated on the basis 
of published data’, and pilot studies and animals were randomized on the basis of 
gender and age. Tumour analysis was performed blindly. No tumour exceeded the 
maximum size approved by the animal welfare committee/regulations. 
Generation and culture of MEFs and tumour cell lines. For MEF generation, 
KrasLSL-G12D/+;p53Fx/Fx animals were interbred and embryos collected at embry- 
onic day (E)12.5, to overcome KrasLSL-G12D/G12D embryonic lethality27, and Cre- 
mediated recombination performed immediately after MEF generation. In short, 
cells were cultured in DMEM supplemented with 10% FBS, 2 mM L-glutamine for 
one passage and then infected with adenovirus-Cre (5 x 10’ plaque-forming units 
per 1 x 10° cells). Cre-mediated recombination was confirmed by PCR. MEF data 
were typically obtained using three independent MEFs per genotype and a mini- 
mum of two independent MEFs per genotype was used in all cases. All (short-term) 
assays were performed in low-passage MEFs (P1-P4 post-Cre). For assessment of 
proliferative capacity, MEFs were cultured under standard 3T3 protocol. Briefly, 
at every passage 3 × 10^5 cells were plated in triplicate on 6 cm plates and counted 
3 days later. Cumulative cell number was calculated as log(Nf/Ni)/log2, where Ni 
and Nf correspond to number of cells plated and final counts/passage, respectively. 
Murine lung tumour cell lines were generated from independent, spontaneous ‘late’ 
lung tumours from three KrasLSL-G12D/+;p53R270H/ER (refs 6, 28) mice. Tumour cells 
were dissociated by collagenase/dispase (Roche) treatment and cultured in DMEM/ 
F12 media supplemented with 10% FBS, 2 mM L-glutamine. Human NSCLC cell 
lines were recently purchased from ATCC (authenticated) and cultured in RPMI 
media supplemented with 10% FBS, 2 mM L-glutamine. 
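
The 3T3 proliferation measure described above, log(Nf/Ni)/log2 per passage, can be illustrated with a short sketch. This is a minimal example under stated assumptions rather than the authors' analysis script: the function names and the example cell counts are hypothetical, and the running total simply sums the per-passage doublings.

```python
import math

def population_doublings(n_initial, n_final):
    """Doublings over one passage: log(Nf/Ni) / log(2)."""
    if n_initial <= 0 or n_final <= 0:
        raise ValueError("cell counts must be positive")
    return math.log(n_final / n_initial) / math.log(2)

def cumulative_doublings(passages):
    """Running total of doublings across (cells plated, cells counted) pairs."""
    total = 0.0
    curve = []
    for n_initial, n_final in passages:
        total += population_doublings(n_initial, n_final)
        curve.append(total)
    return curve

# Hypothetical counts: the same number of cells is re-plated at each passage.
example = [(3e5, 6e5), (3e5, 1.2e6)]
print(cumulative_doublings(example))
```

Plotting the returned list against passage number would give a cumulative growth curve of the kind typically shown for 3T3 assays.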

AnnexinV/PI FACS analysis was performed as described29. For BrdU/PI FACS 
cells were labelled with 10 mM BrdU (Sigma) for 2 h. After harvest, cells were 
fixed and stained with FITC-conjugated Anti-BrdU monoclonal antibody (Becton 
Dickinson), according to the manufacturer’s protocol and resuspended in PBS 
containing 20 μg ml−1 of propidium iodide (PI). FACS was performed using a LSRII 
(BD) flow cytometer and analysed with FlowJo software (Treestar). Cell viability 
following nutrient deprivation and H2O2 administration was determined by 
Trypan Blue exclusion (0.4%, Gibco) or Crystal Violet (0.2%, Sigma), respectively. 
For in vivo luciferase imaging, MEFs and tumour cells were transduced with an 
MSCV-luciferase-hygromycin retrovirus and selected (350 μg ml−1 hygromycin B) 
before intravenous transplantation. All cells used in this study tested negative for 
mycoplasma contamination. In vitro assays were performed in triplicate and run 
at least three independent times. 
Ras and PDH expression and activity. Ras activation was determined by Raf-GST 
pull-down-based ELISA using 100 μg protein/sample (whole-cell lysates) (Merck 
Millipore, Ras activation ELISA kit, 17-497). PDH activity was determined with 
a PDH enzyme activity assay kit (ab109902, Abcam), according to the manufac- 
turer’s instructions. Immunoblotting (40 μg protein per sample) was performed 
with anti-Ras (Cell Signalling, 3339), anti-phospho-Pdhe1α (Ser 293; AP1062; 
Calbiochem), total Pdhe1α (9H9AF5; 459400; Life Technologies) or anti-β-tubulin 
(Cell Signalling 2146, loading control) antibodies. 


Gene expression profiling, IPA and qPCR validation. Microarray analysis was 
performed on three independent MEFs per genotype using Illumina MouseWG-6 
version 2.0 Expression BeadChip (Department of Pathology, Cambridge 
University). Normalized log2 values were determined and average log fold change 
(logFC) calculated for each comparison. Pathway analysis of genes differentially 
expressed (>1.3-fold) between genotypes was performed using IPA software 
(http://www.ingenuity.com) and statistical significance (P < 0.05) of canonical 
pathways determined by Benjamini-Hochberg multiple testing correction. Relative 
gene expression was depicted by heatmaps generated using GENE-E software and 
statistical significance (P < 0.05) determined by t-test. Gene expression changes 
were validated by qPCR using ROCHE Universal Probe Library System or Life 
Technologies probes and all data normalized to 18S expression. 
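
The two filters mentioned above, a >1.3-fold expression difference and Benjamini-Hochberg correction of pathway P values, can be sketched as follows. This is an illustration only: IPA performs its own calculations internally, the function names are hypothetical, and applying the fold-change cut-off in both directions is an assumption not stated in the text.

```python
import math

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that
    # adjusted values never increase as the rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def passes_fold_change(log2_fc, min_fold=1.3):
    """True if the absolute change exceeds min_fold (assumed two-sided)."""
    return abs(log2_fc) > math.log2(min_fold)

# Hypothetical pathway P values.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A pathway would then be called significant when its adjusted value falls below 0.05, matching the threshold quoted in the text.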
Immunohistochemistry. H&E and Ki67 (Bethyl Laboratories, IHC-00375) immu- 
nohistochemistry was performed on formalin-fixed, 5 μm paraffin-embedded tis- 
sue sections. For transplantation studies, the total number of tumours in a single 
representative H&E section per animal (minimum of four lung lobes per section) 
is shown. For endogenous tumours, the percentage of proliferating cells was deter- 
mined as Ki67+/DAPI+ nuclei per tumour from a single representative section per 
animal (minimum of four lung lobes per section). All cells or a minimum of 4,000 
(and >50% coverage) DAPI+ nuclei per tumour were counted from an average of 
5.25 tumours per mouse. 

Extracellular flux profiling. Oxygen consumption rate and ECAR levels were 
determined using a Seahorse XFe24 analyser. Twenty thousand MEFs or 4 × 10^4 
tumour cells were plated in Seahorse XFe24 assay plates. Immediately before anal- 
ysis, media was replaced by bicarbonate-free DMEM (Sigma) supplemented with 
143 mM NaCl, 2% FBS and, where appropriate, 25 mM glucose, 4 mM L-glutamine, 
pH 7.4 and cells incubated at 37 °C for 30 min in a CO2-free incubator. Each cycle 
of measurement involved 3 min mixing, 3 min waiting and 3 min measuring. After 
baseline measurements, testing agent prepared in assay medium was injected and 
followed by subsequent measuring cycles. Glycolysis stress test: measurement 1-3, 
basal (no glucose); 4-6, glucose (10 mM); 7-9, complex V inhibitor oligomycin 
(1 μM); and 10-12, 2-deoxyglucose (2DG, 100 mM). Mitochondrial stress test: 
measurement 1-3, normal (basal + 25 mM glucose); 4-6, oligomycin (1 μM); 7-9, 
carbonyl cyanide m-chlorophenylhydrazone (CCCP, 500 nM); and 10-12, rotenone 
(1 μM). Protein content (BCA assay, Thermo Fisher) at endpoint or cell number 
was used for data normalization. 

Metabolomics analysis. In vitro sample preparation. Five hundred thousand MEFs 
or tumour cells were supplemented with media containing uniformly labelled 
13C-glucose (25 mM) or glutamine (4mM) for 4h before sampling. Metabolites 
were extracted from media (extracellular) and cell pellet (intracellular) in 50% 
MeOH: 30% acetonitrile: 20% H2O + 100 ng ml−1 HEPES buffer (1 ml per 10° 
cells). Samples were incubated at 4 °C for 15 min at 700 r.p.m., before centrifu- 
gation at 13,000 r.p.m. (15,700g). Supernatant was transferred to vials for mass 
spectrometry analysis. 

In vivo sample preparation. KrasG12D/WT;p53Fx/Fx mice were anaesthetized 
(isoflurane inhalation) and administered a bolus of 0.4 mg g−1 body weight 
13C-glucose by tail vein intravenous injection before continuous infusion of 
0.012 mg g−1 min−1 at 150 μl h−1 for 3 h. Normal lungs and independent lung 
tumours were collected and snap frozen. Samples were transferred to Precellys24 
tubes, metabolite extraction buffer (as above) was added (250 μl per 10 mg), sam- 
ples homogenized, centrifuged at 13,000 r.p.m. (15,700g) and supernatant used for 
mass spectrometry analysis. Before analysis, tissues were biopsied for Kras allelic 
assessment (pyrosequencing). 

LC-MS metabolomics. Sequant Zic-pHilic (150 mm × 2.1 mm, internal 
diameter 5 μm) column and guard column (20 mm × 2.1 mm, internal diameter 
5 μm) from HiChrom were used for LC separation. Mobile phase A: 20 mM 
ammonium carbonate plus 0.1% ammonium hydroxide in water. Mobile phase 
B: acetonitrile. Flow rate was maintained at 180 μl min−1 and the gradient was 
as follows: 0-1 min 70% of B; 16 min 38% of B; 16.5 min 70% of B; 25 min 70% 
of B. A mass spectrometer (Thermo QExactive Orbitrap) was operated in full 
MS and polarity switching mode. Samples were randomized to avoid machine 
drifts and run in triplicate. Spectra were analysed using XCalibur Qual Browser 
and XCalibur Quan Browser software (Thermo Scientific) by referencing to an 
internal library of compounds. Relative metabolite abundance was calculated 
as the percentage of the indicated isotopologue to the total pool size of that 
metabolite, and depicted graphically or as a heatmap using GENE-E software. 
Samples were run and processed blindly. 
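
The relative abundance measure described above (each isotopologue as a percentage of the total pool of that metabolite) reduces to a simple normalization. The sketch below assumes peak intensities have already been extracted per isotopologue; the function name, the 'm+n' labels and the example intensities are hypothetical, and no natural-abundance correction is applied because the text does not mention one.

```python
def isotopologue_percentages(intensities):
    """Map each isotopologue label (e.g. 'm+0') to its percentage of the total pool."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total pool size must be positive")
    return {label: 100.0 * value / total for label, value in intensities.items()}

# Hypothetical peak intensities for one metabolite in one sample.
citrate = {"m+0": 600.0, "m+2": 300.0, "m+3": 100.0}
print(isotopologue_percentages(citrate))
```

Averaging these percentages across triplicate injections would give the per-sample values that the figure legends report as triplicate means.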
Glucose consumption and lactate measurements. For glucose consumption anal- 
ysis, 1 x 10° MEF or tumour cells were incubated for 1 h with fluorescent 6-NBDG 
(N-23106, Life Technologies) and analysed by FACS (percentage of NBDG-positive 
cells). Lactate production was assessed 48 h after plating using Lactate Reagent 
(Trinity Biotech, Ireland), according to the manufacturer's instructions. Total 
lactate was normalized to cell number. 


TCGA data set analysis. Lung adenocarcinoma patient TCGA data! (no restric- 
tions) was downloaded from cBioPortal and KRAS mutation and copy number 
assessed. Tumours with a KRAS mutation and KRAS GISTIC score of 0+ were 
taken forward for analysis (n = 65). Samples were divided into two cohorts 
based on GISTIC classification: KRASmut (mutation only, GISTIC = 0, n = 36) or 
KRASmut&CG (mutation and copy gain, GISTIC = 1&2, n = 29). KRAS copy number 
was calculated (2^CNV × 2) from SNP6 data. Pathway analysis was performed on the 
basis of RNaseq' metabolic gene expression data (1,430 genes) using IPA software, 
as described above. 
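
The cohort assignment and copy number estimate described above can be sketched in a few lines. Two assumptions are made here and should be checked against the original paper: the copy number expression is read as 2 raised to the SNP6 log2 copy-number value, multiplied by 2, and tumours with a negative GISTIC score are treated as excluded. The function names are hypothetical.

```python
def copy_number_from_log2(log2_ratio):
    """Absolute copy number from a log2 ratio, assuming a diploid baseline."""
    return (2 ** log2_ratio) * 2

def assign_cohort(has_kras_mutation, gistic_score):
    """Return 'mut', 'mut&CG', or None when the tumour is not taken forward."""
    if not has_kras_mutation or gistic_score < 0:
        return None
    return "mut" if gistic_score == 0 else "mut&CG"

# Hypothetical tumours: (KRAS mutated?, GISTIC score, log2 copy-number ratio).
tumours = [(True, 0, 0.0), (True, 2, 1.0), (True, -1, -0.4), (False, 1, 0.5)]
for mutated, gistic, log2_ratio in tumours:
    print(assign_cohort(mutated, gistic), copy_number_from_log2(log2_ratio))
```

Under these assumptions a log2 ratio of 0 corresponds to two copies and a ratio of 1 to four copies.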

Reagents. Metabolic probes: 6-NBDG (30 μM), Mitotracker Green (50 nM), 
TMRM (50 nM), NAO (200 nM) and CellRox deep red (5 μM) were obtained 
from Life Technologies. Cellular ROS was determined using CellRox deep red 
probe (Life Technologies, C10422) and FACS analysis. GSH/GSSG and NADPH/ 
NADP levels and ratios were calculated according to the manufacturer's instruc- 
tions (Promega, V6611 and G9081, respectively). Normal and low GLC and GLN 
(Sigma) correspond to 25 mM and 5mM, and 4mM and 0.5 mM, respectively. Cells 
were treated with the following agents, either alone or in combination as indicated: 
100 μM H2O2, 10 mM 2DG and 2 mM (tumour cells) or 5 mM BSO (MEFs). Total 
protein content was assessed using Sulforhodamine B (Sigma, 0.057% w/v), and 
total RNA content extracted with TRIzol (Life Technologies). NAC (4mM, Sigma) 
was added to cells daily. 

Pyrosequencing analysis. Genomic DNA (from MEFs, tumour cells and tumours) 
was isolated and the Kras WT/mutant allelic ratio determined by pyrosequencing 
(PyromarkQ24, Qiagen) according to the manufacturer’s instructions. Pyrograms 
were analysed (PyroMarkQ24 software) and WT and mutant Kras allelic frequency 
determined on the basis of standards (WT:mutant ratios 2:0, 1:1, 0:2) and shown 
as the percentage of total Kras content. 

Primers and probes. Genotyping: KrasLSL-G12D (ref. 6) and p53Fx (ref. 12) 
alleles were genotyped as reported. Microarray validation primers and probes 
were as follows. (UPL library, Roche): 
GAPDH: forward 5’-GGGTTCCTATAAATACGGACTGC-3’, reverse 5’-CCATTTTGTCTACGGGACGA-3’; Probe 52; 
Slc2a1: forward 5’-GGATCCCAGCAGCAAGAAG-3’, reverse 5’-CCAGTGTTATAGCCGAACTGC-3’; Probe 76; 
Pfkl: forward 5’-GGGTCATGTACAGCGAGGA-3’, reverse 5’-GGCCTCCATACCCATCTTG-3’; Probe 41; 
Eno1: forward 5’-GAGGACACTTTCATCGCAGAC-3’, reverse 5’-CCAGCTCTTCCTCAATTCTGA-3’; Probe 77; 
Nrf2: Mm00477784_m1, Ldha: Mm01612132_m1 (Life Technologies), 18S: 4352930E (Thermo Fisher). 
Pyrosequencing: (Qiagen Q24 pyrosequencing assay) 
mKrasG12D forward 5’-GTAAGGCCTGCTGAAAATGACTGA-3’; 
mKrasG12D reverse 5’-[Btn]TATCGTCAAGGCGCTCTTGC-3’, 
mKrasG12D sequencing primer 5’-TGAAAATGACTGAGTATAAA-3’. 
Statistical analysis. Data were visualized and statistical analyses performed 
using Prism 5.0 software (GraphPad) or R statistical package. P < 0.05 was 
considered statistically significant. In all cases, experimental groups showed 
comparable variance. P values for unpaired comparisons between two groups 
with comparable variance were calculated by two-tailed Student’s t-test. One- 
way ANOVA (all groups against wild-type (WT) group) was used for anal- 
ysis between three or more groups with comparable variance, followed by 
Bonferroni post-test for individual comparisons. Ordinary two-way ANOVA 
(treatment groups against individual genotype control group with compara- 
ble variance) was used for analysis that involved two variables, followed by 
Bonferroni post-test for individual comparisons. Kaplan-Meier comparison 
was used for analysis of survival cohorts. Pearson's correlation analysis was used 
to compare relationships between variables in groups with similar distribution. 
TCGA gene expression data were analysed using a negative binomial general- 
ized linear model (DESeq2). *P < 0.05; **P < 0.01; ***P < 0.001. Error bars, 
mean ± s.d. or s.e.m., as indicated. 
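
The multiplicity adjustment and the significance notation used throughout the figures can be illustrated with a small sketch. This shows only the Bonferroni scaling step and the star thresholds listed above; it does not reproduce the ANOVA post-tests as computed by Prism, and the function names and example P values are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def significance_stars(p):
    """Map a P value to the notation used in the figure legends."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "NS"

# Hypothetical raw P values from three pairwise comparisons.
raw = [0.0002, 0.01, 0.2]
for p in bonferroni_adjust(raw):
    print(p, significance_stars(p))
```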


27. Johnson, L. et al. K-ras is an essential gene in the mouse with partial functional 
overlap with N-ras. Genes Dev. 11, 2468-2481 (1997). 

28. Christophorou, M. A. et al. Temporal dissection of p53 function in vitro and 
in vivo. Nature Genet. 37, 718-726 (2005). 

29. Martins, C. P., Brown-Swigart, L. & Evan, G. I. Modeling the therapeutic efficacy 
of p53 restoration in tumors. Cell 127, 1323-1334 (2006). 
Extended Data Figure 1 | Enhanced glycolysis in homozygous KrasG12D 
cells. a, Representative data (n = 3) of PCR analysis of the Kras and p53 
loci in KrasWT/WT (WT/WT), KrasG12D/WT (G12D/WT) and KrasG12D/G12D 
(G12D/G12D);p53Fx/Fx MEFs after Cre-mediated recombination; and of 
unrecombined KrasLSL-G12D/WT;p53loxP/loxP control (Cre-) (*background 
band). b, IPA analysis of canonical pathways significantly altered in 
KrasG12D/G12D relative to KrasG12D/WT MEF transcriptomes (n = 3 per 
genotype). c, Representative qPCR data (n = 3) of glycolytic gene 
expression in MEFs. Fold change relative to WT/WT shown as triplicate 
mean ± s.d. (one-way ANOVA). d, ECAR in MEFs following exposure to 
glucose, oligomycin and 2DG. Representative data from three independent 
MEFs per genotype show mean value between triplicates ± s.d. (two-way 
ANOVA). e, Kras locus analysis of two lung cancer cell lines (L1212 and 
L1211) generated from spontaneous tumours from KrasG12D/WT;p53- 
deficient mice. PCR (top) and pyrosequencing (bottom) analysis shown 
(L1212: Kras heterozygous, G12D/WT; L1211: G12D homozygous, G12D/ 
G12D). Recombined heterozygous MEFs shown as PCR control (CTRL). 
f, Representative qPCR data (n = 3) of glycolytic gene expression in L1211 
and L1212 lung tumour cells. Fold change relative to heterozygous cells 
shown (mean of triplicates ± s.d.; ***P < 0.001; *P < 0.05; t-test). g, Left: 
basal glucose consumption in murine lung tumour cells determined by 
FACS analysis of 6-NBDG uptake (%). *P = 0.02; t-test. Right: extracellular 
lactate concentration (ng/dl/cell) in murine lung tumour cells. Data are 
triplicate mean ± s.d. *P = 0.0139; t-test. 
Extended Data Figure 2 | KrasG12D/WT and KrasG12D/G12D MEFs have 
similar biomass and mitochondrial functionality. a, Total protein 
content in indicated MEFs relative to WT/WT. b, Total RNA per cell for 
each of the indicated genotypes. c, d, WT/WT, G12D/WT and G12D/ 
G12D MEFs were profiled by CASY counter (Roche) and cell volume (c) 
and diameter (d) measured. a-d, Mean value of three independent MEF 
triplicates per genotype ± s.d. e, Oxygen consumption rate (OCR) of MEFs 
in response to oligomycin, CCCP and rotenone (two-way ANOVA). 
f, NAO staining was used to determine mitochondrial mass in KrasWT/WT 
(WT/WT), KrasG12D/WT (G12D/WT) and KrasG12D/G12D (G12D/G12D) 
MEFs. Geometric mean of NAO fluorescence in cells was determined 
by FACS. Representative overlay (left panel) and geometric mean (right 
panel) displayed. g, Mitochondrial architecture was examined after 
Mitotracker green staining in WT/WT, G12D/WT and G12D/G12D 
MEFs (scale bar, 10 μm). h, TMRM staining was used to determine 
mitochondrial membrane potential in MEFs of indicated genotypes. 
Geometric mean of TMRM fluorescence in cells was determined by 
FACS. Representative overlay (left panel) and geometric mean (right panel) 
displayed. e-h, Representative data of three independent MEFs per 
genotype show mean of triplicates ± s.d.; ***P < 0.001; one-way ANOVA. 
Extended Data Figure 3 | Glucose metabolism reprogramming 
in KrasG12D/G12D MEFs. a-j, Measurement of 13C-glucose-derived 
metabolites, calculated as a percentage of the total metabolite pool 
following LC-MS analysis of WT/WT, G12D/WT and G12D/G12D MEFs 
after 4 h culture with 13C-glucose-supplemented media. Representative 
data (of two independent MEFs per genotype) showing mean of 
triplicates ± s.d.; ***P < 0.001; **P < 0.01; *P < 0.05 (two-way ANOVA). 
Undetected isotopologues not shown. 
Extended Data Figure 4 | Glucose metabolism reprogramming in 
lung tumour cells with mutant Kras copy gain. Measurement of 
13C-glucose-derived metabolites, calculated as a percentage of the 
total metabolite pool following LC-MS analysis of murine (L1211 and 
L1212, a-h) and human (H23, H358, H460, SW1573, i-p) mutant Kras 
heterozygous and homozygous lung tumour cells. Cells were cultured 
for 4 h with 13C-glucose-supplemented media before analysis. Data show 
mean of triplicates ± s.d.; ***P < 0.001; **P < 0.01; *P < 0.05 (two-way 
ANOVA, relative to Krasmut heterozygous cells; i-p, homozygous samples 
significantly different from both heterozygous cell lines indicated). 
Undetected isotopologues not shown. 
Extended Data Figure 5 | Kras^G12D/WT and Kras^G12D/G12D MEFs have
distinct glutamine metabolism profiles. Glutamine metabolism analysis
in WT/WT, G12D/WT and G12D/G12D MEFs. a, Representation of
carbon flux (grey circles) from uniformly labelled ¹³C-glutamine (¹³C-GLN).
b, Heatmap illustrates abundance of selected labelled metabolites
across triplicates of representative MEFs (two independent MEFs per
genotype analysed) based on metabolomics analysis. c-i, Measurement of
¹³C-glutamine-derived metabolites, calculated as a percentage of the total
metabolite pool following LC-MS analysis of WT/WT, G12D/WT and
G12D/G12D MEFs after 4 h culture with ¹³C-glutamine-supplemented
media. Representative data (two independent MEFs per genotype) show
mean of triplicates ± s.d. (two-way ANOVA). j, Oxygen consumption rate
(OCR) of WT/WT, G12D/WT and G12D/G12D MEFs upon glutamine
(4 mM) addition. Representative data of three independent MEFs
per genotype showing mean of triplicates ± s.d. k, Relative diversion
(percentage) of glutamine to TCA (αKG m+5) or GSH (GSH m+5) in
MEFs of indicated genotypes based on metabolomics data. Representative
MEF data (n = 2 MEFs per genotype) show triplicate mean ± s.d. (one-way
ANOVA). ***P < 0.001; **P < 0.01; *P < 0.05.
Extended Data Figure 6 | Kras^G12D homozygous cells depend on glucose
metabolism reprogramming for ROS management. a, GSSG levels in
G12D/WT and G12D/G12D MEFs relative to WT/WT. Mean data (n = 3
MEFs per genotype) ± s.d. shown. b, ROS levels in MEFs following 48 h
of 2DG treatment. Data were normalized to vehicle treatment (CTRL).
c, Percentage of AnnexinV/PI double-positive G12D/G12D MEFs
following 48 h of 2DG treatment in the presence (+) or absence (−)
of NAC. d, Ratio of reduced to oxidized glutathione (GSH/GSSG)
determined for WT/WT, G12D/WT and G12D/G12D MEFs after
incubation with 2DG, BSO or both (2DG + BSO) for 48 h, normalized
to vehicle. b-d, Representative data from three independent MEFs per
genotype presented. Mean data for triplicates ± s.d. shown (two-way
ANOVA). e, Representative data of GSH/GSSG ratio and GSH levels
in murine G12D/G12D tumour cells relative to G12D/WT (t-test).
f, Differential sensitivity of lung tumour cells to nutrient depletion. Lung
tumour cells were cultured in normal media and low glucose conditions
for 72 h and viable cells counted and normalized to CTRL (two-way
ANOVA). g, Percentage of AnnexinV/PI double-positive murine tumour
cells following 48 h treatment with BSO, 2DG, or both (2DG + BSO).
e-g, Representative data (n = 3 independent experiments) depict triplicate
mean ± s.d. (***P < 0.001, two-way ANOVA). ***P < 0.001; **P < 0.01;
*P < 0.05.
Extended Data Figure 7 | Increased mutant Kras allelic content leads
to glucose metabolism reprogramming in lung tumours in vivo.
a-i, Control (no Cre) and tumour-bearing Kras
infused with ¹³C-glucose 12 (early group) or 16 weeks (late group) after
adenoviral-Cre treatment and individual lung tumours (early, n = 16;
late, n = 12) or control lung (normal, n = 3) collected for LC-MS analysis
(three technical replicates per sample). Selected ¹³C-glucose-derived
metabolites shown, calculated as a percentage of the total metabolite
pool. Mean abundance per cohort ± s.e.m. shown. ***P < 0.001; *P < 0.05
Extended Data Table 1 | Mutant heterozygous and homozygous KRAS NSCLC cell lines

KRAS Heterozygous              KRAS Homozygous
Cell line    KRAS Mutation     Cell line    KRAS Mutation
NCI-H23      p.G12C            NCI-H460     p.Q61H
NCI-H358     p.G12C            SW1573       p.G12C

KRAS mutation and zygosity of four NSCLC cell lines according to COSMIC Cell Lines project database (version 73). Genotypes were confirmed by Sanger sequencing (data not shown).
Extended Data Table 2 | KRAS status in panel of human lung adenocarcinomas

KRAS^mut                                      KRAS^mut&CG
ID               KRAS      KRAS    KRAS       ID               KRAS      KRAS    KRAS
                 mutation  GISTIC  CNV                         mutation  GISTIC  CNV
TCGA-78-7539-01  G12C      0       -0.065     TCGA-95-7567-01  G12V      1       0.135
TCGA-49-6761-01  G12D      0       -0.032     TCGA-55-7726-01  G12C      1       0.34
TCGA-55-7281-01  G12.      0       0.012      TCGA-64-5775-01  Q61L      11.75
TCGA-05-4430-01  G12C      0       0.056
TCGA-49-6744-01  G12C      0       0.058
TCGA-73-4659-01  G12V      0       0.069
TCGA-50-5936-01  G12C      0       0.088
TCGA-78-7167-01  G12F      0       0.093
TCGA-55-6642-01  G12V      0       0.099
TCGA-44-6145-01  G12V      0       0.099

KRAS mutation, GISTIC score and copy number variation (CNV) in TCGA lung adenocarcinoma data set. KRAS mutation (status and nucleotide substitution), KRAS putative copy number calls from
GISTIC 2.0 analysis (GISTIC score) and KRAS copy number variation (Affymetrix SNP6) were downloaded from cBioportal. Mutant KRAS tumours with a GISTIC score of ≥ 0 (n = 65) were divided into
two cohorts: KRAS mutant (KRAS^mut, GISTIC = 0, n = 36) and KRAS mutant and copy gain (KRAS^mut&CG, GISTIC = 1 & 2, n = 29), as displayed.
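
The cohort rule in this table note can be written out explicitly. The following Python sketch assumes each tumour is represented by a sample ID and an integer GISTIC score; the function name `assign_cohort` and the data layout are illustrative choices, not the authors' analysis code.

```python
def assign_cohort(gistic_score):
    """Assign a KRAS-mutant tumour to a cohort from its GISTIC score.

    GISTIC = 0       -> "KRAS mutant"
    GISTIC = 1 or 2  -> "KRAS mutant and copy gain"
    anything else    -> None (for example copy loss, not part of either cohort)
    """
    if gistic_score == 0:
        return "KRAS mutant"
    if gistic_score in (1, 2):
        return "KRAS mutant and copy gain"
    return None


# Four rows from the table: (sample ID, GISTIC score).
samples = [
    ("TCGA-78-7539-01", 0),
    ("TCGA-49-6761-01", 0),
    ("TCGA-95-7567-01", 1),
    ("TCGA-55-7726-01", 1),
]

cohorts = {}
for sample_id, score in samples:
    label = assign_cohort(score)
    if label is not None:
        cohorts.setdefault(label, []).append(sample_id)
```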
Cryo-electron microscopy structure of a
coronavirus spike glycoprotein trimer


Alexandra C. Walls, M. Alejandra Tortorici, Berend-Jan Bosch, Brandon Frenz, Peter J. M. Rottier, Frank DiMaio,
Félix A. Rey & David Veesler


The tremendous pandemic potential of coronaviruses was 
demonstrated twice in the past few decades by two global outbreaks 
of deadly pneumonia. Entry of coronaviruses into cells is mediated 
by the transmembrane spike glycoprotein S, which forms a trimer 
carrying receptor-binding and membrane fusion functions.
S also contains the principal antigenic determinants and is the
target of neutralizing antibodies. Here we present the structure
of a mouse coronavirus S trimer ectodomain determined at 4.0 Å
resolution by single particle cryo-electron microscopy. It reveals
the metastable pre-fusion architecture of S and highlights key
interactions stabilizing it. The structure shares a common core with
paramyxovirus F proteins, implicating mechanistic similarities
and an evolutionary connection between these viral fusion proteins. 
The accessibility of the highly conserved fusion peptide at the 
periphery of the trimer indicates potential vaccinology strategies 
to elicit broadly neutralizing antibodies against coronaviruses. 
Finally, comparison with crystal structures of human coronavirus 
S domains allows rationalization of the molecular basis for species 
specificity based on the use of spatially contiguous but distinct 
domains. 

Coronaviruses are enveloped viruses responsible for 30% of mild
respiratory infections and atypical pneumonia in humans worldwide.
The emergence of the severe acute respiratory syndrome coronavirus
(SARS-CoV) in 2002 and of the Middle East respiratory syndrome
coronavirus (MERS-CoV) in 2012 demonstrated that these zoonotic
viruses can transmit to humans from various animal species, and
suggested that additional emergence events are likely to occur. The
fatality rates of SARS-CoV and MERS-CoV infections are about 10-37% and
there are no approved antiviral treatments or vaccines.

Coronaviruses use S homotrimers to promote cell attachment and
fusion of the viral and host membranes. S determines host range and cell
tropism and is the main target of neutralizing antibodies during
infection. S is a class I viral fusion protein synthesized as a single chain
precursor of about 1,300 amino acids that trimerizes upon folding.
It is composed of an amino-terminal S₁ subunit, containing the
receptor-binding domain, and a carboxy-terminal S₂ subunit, driving
membrane fusion. Cleavage by furin-like host proteases at the
junction between S₁ and S₂ (S₂ cleavage site) occurs during biogenesis
for some coronaviruses such as mouse hepatitis virus (MHV, the
prototypical and best-studied coronavirus). The S₁ and S₂ subunits
remain non-covalently associated in the metastable pre-fusion
S trimer. After virion uptake by target cells, a second cleavage is mediated
by endo-lysosomal proteases (S₂′ cleavage site), allowing fusion activation
of coronavirus S proteins.

Crystal structures of coronavirus S post-fusion cores demonstrated
that the fusogenic conformational changes lead to the formation of
a so-called trimer of hairpins that is the hallmark of class I fusion
proteins. These structures contain two heptad-repeat (HR) regions
present in S₂ assembled as an extended triple helical coiled-coil motif
(HR1) surrounded by three shorter helices (HR2). Crystal structures of
several coronavirus S receptor-binding domains in complex with their
cognate receptors have also been reported. Finally, cryo-electron
microscopy (cryoEM) of SARS-CoV virions provided a snapshot of
the S glycoprotein at 16 Å resolution. The lack of high-resolution data
for any coronavirus S trimer has prevented a detailed analysis of the
infection mechanisms.

We produced an MHV S ectodomain trimer with enhanced stability
by mutating the S₂ cleavage site and fusing a GCN4 trimerization motif
at the C-terminal end of the construct. The resulting MHV S
ectodomain forms a trimer that binds with high affinity to the soluble mouse
CEACAM1a receptor (Extended Data Fig. 1a, b). We used state-of-the-art
cryoEM to determine the structure of the MHV S ectodomain
trimer at 4.0 Å resolution (Fig. 1a-c and Extended Data Figs 2 and 3).
We fitted the crystal structures of two S₁ domains and built
de novo the rest of the polypeptide chain using Coot and Rosetta
(Fig. 1d-f, Extended Data Figs 2-4 and Supplementary Tables 1 and 2).
The final model includes residues 15 to 1118, with an internal break
corresponding to a loop immediately upstream from the S₂′ cleavage
site (residues 827-863). The region connecting the S₁ and S₂ subunits
(residues 718-754) features weak density that correlates with its
accessibility for proteolytic cleavage in vivo. Residues 453-535 were modelled
by density-guided homology modelling using Rosetta owing to the
poor quality of the density in this region (Extended Data Fig. 3k).

The MHV S ectodomain is a 140-Å-long trimer with a triangular
cross-section varying in diameter from 70 Å, at the membrane proximal
base, to 140 Å at the membrane distal end (Fig. 1d, e). The structure
comprises two functional subunits (Fig. 2a-d): a distal moiety
constituted by the S₁ subunits; and a central stem connecting to the viral
membrane formed by the S₂ subunits.

The S₁ subunit has a 'V' shape contributing to the overall triangular
appearance of the S trimer (Extended Data Fig. 5a). The S₁ N-terminal
moiety comprises domain A, which is folded as a galectin-like β-sandwich
decorated with extended loops on the viral membrane distal side, and
a three-stranded antiparallel β-sheet plus an α-helix on the viral
membrane proximal side. The S₁ C-terminal half folds as three spatially
distinct β-rich domains, termed B, C and D (Fig. 2a-d).

The S₂ subunit connects to the viral membrane and is characterized
by the presence of long α-helices (Figs 2b-d and 3a). A central helix
(α3) stretches 75 Å along the three-fold molecular axis towards the
viral membrane (Fig. 3a). It is located immediately downstream of the
HR1 motif, which folds as four consecutive α-helices (c¢—-C9; Fig. 3a
and Extended Data Fig. 6a, b), in sharp contrast to the 120-Å-long
HR1 helix observed in the post-fusion S structures (Extended Data
Fig. 6c-e). The 55-Å-long upstream helix (α29), so named because it
is located immediately upstream of the S₂′ cleavage site, runs parallel
to and is zipped against the central helix via hydrophobic contacts
largely following a heptad-repeat pattern. A core antiparallel β-sheet
(β46-β49-β50) is present at the viral membrane proximal end and is
assembled from an N-terminal β-strand (β46), preceding the upstream
helix, and a C-terminal β-hairpin (β49-β50), located downstream of
the central helix.

MHV S₂ features a topology similar to the paramyxovirus F proteins
(such as respiratory syncytial virus (RSV) F: root mean squared
deviation (r.m.s.d.) 4 Å over 125 residues), with a comparable 3D
organization of the core β-sheet, the upstream helix and the central helix
(Fig. 3a, b). Importantly, these motifs were shown to remain invariant
in the pre- and post-fusion F structures. The conservation of these
motifs among coronavirus S and paramyxovirus F proteins suggests that
these fusion proteins have evolved from a distant common ancestor.
Although the density is too weak to trace the polypeptide chain
downstream from β59, secondary structure predictions suggest that the domain
directly preceding HR2 could adopt a similar fold in coronavirus S
and paramyxovirus F proteins.

In the S trimer, the three central helices are packed via their
central portions whereas the two ends splay away from the three-fold
axis (Extended Data Fig. 7a-c). Additional contacts between the
upstream and central helices participate in inter-protomer
interactions. Furthermore, the S₁ subunits interlock to form a crown around
the S₂ trimer, stabilizing it in the pre-fusion conformation (Fig. 3c, d
and Supplementary Table 3). This is illustrated by the large surface
area buried at the interface between each S₁ subunit and the S₂
subunits of the three protomers (1,970 Å²). Many of these contacts involve
the HR1 helices and the fusion peptide region. These polypeptide
segments undergo major refolding during the fusogenic conformational
changes (Extended Data Fig. 6a-e), which supports the notion that
the S₁ subunits maintain the S₂ fusion machinery in its metastable
state. Substitutions of the conserved alanine 994 by valine in helix
2 or of the conserved leucine 1062 by phenylalanine in the
central helix were shown to attenuate fusogenicity. Our structure
suggests that the former substitution would strengthen hydrophobic
packing against the core β-sheet (Extended Data Fig. 7b), and that
the latter substitution could reinforce molecular stapling of the
central helices (Extended Data Fig. 7a, c). The expected modification
of the energy landscape between pre-fusion and post-fusion
conformations would explain the reduction in fusion activity of these
mutants.


Figure 1 | 3D reconstruction of the MHV S trimer determined
by single-particle cryoEM. a-c, 3D map filtered at 4.0 Å
resolution coloured by protomer. Two different views of the
S trimer (from the side (a) and from the top, looking towards the
viral membrane (b)), and a side view of one S protomer (c) are
shown. d-f, Ribbon diagrams showing the MHV S atomic model
oriented as in a-c.


Figure 2 | Architecture of the MHV S protomer.
a, Schematic diagram of the S glycoprotein
organization. Black and grey dashed lines denote
regions unresolved in the reconstruction and
regions that were not part of the construct,
respectively. BH, β-hairpin (β49-β50); CH, central
helix; CT, cytoplasmic tail; FP, fusion peptide;
HR1/HR2, heptad-repeats; TM, transmembrane
domain; UH, upstream helix. b-d, Ribbon
diagrams depicting three views of the S protomer
coloured as in a. Asterisk denotes the MHV S
receptor-binding region. Disulfide bonds are
shown as green sticks except for residues 453-535,
for which they are not shown.
The predicted fusion peptide includes the C-terminal half of helix
α21 and extends up to the N-terminal half of α22 (refs 6 and 23) (Fig. 2c).
α21 is an amphipathic helix located at the periphery of the S trimer,
burying hydrophobic side chains towards the S₂ centre and exposing
charged residues to solvent (Fig. 2c and Extended Data Fig. 7b, c). In
the case of porcine epidemic diarrhoea coronavirus, trypsin processing
at the S₂′ site can only occur after host cell attachment. This indicates
that receptor binding could allosterically increase the accessibility of
the S₂′ site, which is located within helix α21. The acidic pH of the
endo-lysosomes could also contribute to exposing the S₂′ cleavage site for
coronaviruses requiring cleavage in this compartment. The fact that
helix α21 appears dynamic and is found immediately downstream from
a disordered loop suggests that it could undergo considerable 'breathing'
motions. Regardless of the mechanism promoting cleavage, the MHV S
structure reported here explains the requirement for processing at the
S₂′ site, as it frees the fusion peptide from the S₂ N-terminal region,
which is a prerequisite for its insertion ~200 Å away in the target
membrane. The peripheral position of the fusion peptide is similar to what
has been observed in the parainfluenza virus 5 F and HIV gp41 (ref. 25)
prefusion structures (Extended Data Fig. 8a-c). The notable accessibility
of the fusion peptide and its sequence conservation among
coronaviruses suggest that it would be an ideal target for epitope-focused
vaccinology initiatives aimed at raising broadly neutralizing antibodies
against S glycoproteins (Fig. 4a-c and Extended Data Fig. 9). Major
antigenic determinants (inducing neutralizing antibodies) of MHV
and SARS-CoV S proteins overlap with the fusion peptide region and
support the suitability of this approach. Antibodies binding to this
site will not only hinder insertion of the fusion peptide into the target
membrane, but will also putatively prevent fusogenic conformational
changes. This epitope-focused strategy has proven successful in obtaining
neutralizing antibodies against RSV F.


Figure 3 | Pre-fusion structure of the coronavirus
fusion machinery. a, b, Topology and ribbon
diagrams showing the structural similarity
between coronavirus MHV S₂ (starting at residue
755) (a) and paramyxovirus RSV F (PDB 5C6B)
(b). For clarity, only part of RSV F is shown, with
conserved secondary structural elements coloured
identically as for MHV S₂. '#' denotes motifs
participating in the post-fusion HR1 coiled-coil.
c, d, Two different views of the MHV S trimer
(from the side (c) and top, looking towards the
host cell membrane (d)) highlighting how S₁
(ribbon diagram and semi-transparent surface)
wraps around the S₂ fusion machinery (ribbon
diagram) to stabilize it.


Figure 4 | Potential strategy for neutralizing coronavirus infections.
a, Surface representation of the MHV S trimer coloured according to
sequence conservation using the alignment presented in Extended
Data Fig. 9. The fusion peptide sequence is highly conserved among
coronavirus S proteins. b, Surface representation of the MHV S trimer
highlighting the peripheral position of the fusion peptide (blue and
cyan). c, Ribbon diagrams of the MHV S trimer showing the overlapping
positions of the fusion peptide (residues 870-887, blue and cyan) and of a
major antigenic determinant identified for MHV and SARS-CoV (residues
875-905, magenta spheres).

The spatial proximity of domains A and B in the S trimer allows
rationalization of their alternative use among coronaviruses to interact with
host receptors. MHV uses the viral membrane distal loops decorating
domain A to interact with CEACAM1a (ref. 13), whereas MERS-CoV
and SARS-CoV rely on the β-motif protruding from domain B to bind
to DPP4 (ref. 11) or ACE2 (refs 12 and 14), respectively (Extended
Data Fig. 5a-d). The poor sequence conservation of the B domain
β-motif among coronavirus S proteins, its considerable length variation
among MHV strains (Extended Data Fig. 9) and our density-guided
homology model of this motif indicate structural and functional
differences. These structural variations constitute the molecular basis
underlying coronavirus species specificity and cell tropism using a single S
architectural scaffold.

Sequence comparisons indicate that the MHV spike S₁ and S₂
subunits respectively share ~25% and ~40% sequence similarity with
many other coronavirus S proteins (Extended Data Fig. 9). Therefore,
the structure reported here is representative of the architecture of other
coronavirus S such as those of MERS-CoV and SARS-CoV. This
hypothesis is further supported by the structural similarity of (1) the MHV
and bovine coronavirus A domains; (2) the MHV, MERS-CoV,
SARS-CoV and HKU4 (ref. 29) B domains (Extended Data Fig. 10);
(3) the post-fusion cores of MHV, SARS-CoV and MERS-CoV; and
(4) the isolation of infectious coronaviruses featuring a deletion of the
A domain and using domain B as the receptor-binding domain. Our
results now provide a framework to understand coronavirus entry and
suggest ways for preventing or treating future coronavirus outbreaks.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


No statistical methods were used to predetermine sample size. 
Plasmids. A human codon-optimized gene encoding the MHV spike gene
(UniProt: P11224) was synthesized with an Arg717Ser amino acid mutation
to abolish the furin cleavage site at the S₁-S₂ junction (S₂ cleavage site). From
this gene, the fragment encoding the MHV ectodomain (residues 15-1231) was
PCR-amplified and ligated to a gene fragment encoding a GCN4 trimerization
motif (IKRMKQIEDKIEEIESKQKKIENEIARIKKIK), a thrombin cleavage
site (LVPRGSLE), an 8-residue long Strep-Tag (WSHPQFEK) and a stop codon.
This construct results in fusing the GCN4 trimerization motif in register with the
HR2 helix at the C-terminal end of the MHV S-encoding sequence. This gene was
cloned into the pMT/BiP/V5/His expression vector (Invitrogen) in frame with
the Drosophila BiP secretion signal downstream of the metallothionein promoter.
The D1 domain of mouse CEACAM1a (residues 35-142; gb NP_001034274.1)
was amplified by PCR and cloned into a mammalian expression plasmid, in
frame with a CD5 signal sequence at the 5′ end, and with a sequence encoding a
thrombin cleavage site, a glycine linker and the Fc domain of human IgG1 at the
3′ end, creating the pCD5-MHVR-T-Fc vector.
Production of recombinant CEACAM1a ectodomain by transient transfection.
293-F cells were grown in suspension using FreeStyle 293 Expression Medium
(Life Technologies) at 37 °C in a humidified 5% CO₂ incubator on a Celltron shaker
platform (Infors HT) rotating at 130 r.p.m. (for 1 l culture flasks). Twenty-four
hours before transfection, cell density was adjusted to 1.5 × 10⁶ cells ml⁻¹, and
the culture was grown overnight under the same conditions to reach
~2.5 × 10⁶ cells ml⁻¹ on the day of transfection. Cells were collected by centrifugation
at 1,250 r.p.m. for 5 min, and resuspended in fresh FreeStyle 293 Expression
Medium (Life Technologies) without antibiotics at a density of 2.5 × 10⁶ cells ml⁻¹.

To produce recombinant CEACAM1a ectodomain, 400 µg of pCD5-MHVR-T-Fc
vector (purified using EndoFree plasmid kit from Qiagen) were added to
200 ml of suspension cells. The cultures were swirled for 5 min on a shaker in the
culture incubator before adding 9 µg ml⁻¹ of linear polyethylenimine (PEI)
solution (25 kDa, Polysciences). Twenty-four hours after transfection, cells were diluted
1:1 with FreeStyle 293 Expression Medium and the transfected cells were cultivated
for 6 days. Clarified cell supernatants were concentrated tenfold using Vivaflow
tangential filtration cassettes (Sartorius, 10-kDa cut-off) before affinity purification
using a Protein A column (GE Life Sciences) followed by gel filtration
chromatography using a Superdex 200 10/300 GL column (GE Life Sciences) equilibrated in
20 mM Tris-HCl, pH 7.5, 100 mM NaCl. The Fc tag was removed by trypsin
cleavage in a reaction mixture containing 7 mg of recombinant CEACAM1a ectodomain
and 5 µg of trypsin in 100 mM Tris-HCl, pH 8.0 and 20 mM CaCl₂. The reaction
mixture was incubated at 25 °C overnight and re-loaded in a Protein A column to
remove uncleaved protein and the Fc tag. The cleaved protein was further purified
by gel filtration using a Superdex 75 10/300 GL column (GE Life Sciences)
equilibrated in 20 mM Tris-HCl, pH 7.5, 100 mM NaCl. The purified protein was
quantified using absorption at 280 nm and concentrated to approximately 10 mg ml⁻¹.
Production of recombinant MHV S ectodomain in Drosophila S2 cells. To
generate a stable Drosophila S2 cell line expressing recombinant MHV spike
ectodomain, we used Effectene (Qiagen) and 2 µg of the plasmid encoding the MHV
S protein ectodomain. A second plasmid, encoding blasticidin S deaminase, was
cotransfected as dominant selectable marker. Stable MHV S ectodomain expressing
cell lines were selected by addition of 10 µg ml⁻¹ blasticidin S (Invivogen) to the
culture medium 48 h after transfection.

For large-scale production of MHV S ectodomain the cells were cultured
in spinner flasks and induced by 5 µM CdCl₂ at a density of approximately 10⁷
cells per ml. After a week at 28 °C, clarified cell supernatants were concentrated
40-fold using Vivaflow tangential filtration cassettes (Sartorius, 10-kDa cut-off)
and adjusted to pH 8.0, before affinity purification using a StrepTactin Superflow
column (IBA) followed by gel filtration chromatography using a Superose 6 10/300
GL column (GE Life Sciences) equilibrated in 20 mM Tris-HCl, pH 7.5, 100 mM
NaCl. The purified protein was quantified using absorption at 280 nm and
concentrated to approximately 4 mg ml⁻¹.
SEC-MALS. For size exclusion chromatography coupled with multi-angle light
scattering (SEC-MALS) analysis, samples (0.2 ml at 1 mg ml⁻¹) were loaded onto
a Superdex 200 10/300 GL column (GE Life Sciences, 0.4 ml min⁻¹ in gel filtration
buffer) and passed through a Wyatt DAWN Heleos II EOS 18-angle laser
photometer coupled to a Wyatt Optilab TrEX differential refractive index detector. Data
were analysed using Astra 6 software (Wyatt Technology Corp).
MicroScale Thermophoresis. Solution MicroScale Thermophoresis (MST)
binding studies were performed using standard protocols on a Monolith NT.115
(Nanotemper Technologies). In brief, recombinant CEACAM1a ectodomain
protein was labelled using the RED-NHS (Amine Reactive) Protein Labelling
Kit (Nanotemper Technologies). The MHV S ectodomain protein was serially
diluted in 20 mM Tris-HCl, pH 7.5, 100 mM NaCl and the labelled recombinant
CEACAM1a was added to a final concentration of 500 nM before overnight
incubation at 4 °C. The CEACAM1a concentration was chosen such that the
observed fluorescence was approximately 1,000 U at 40% LED power. The samples
were loaded into standard-treated Monolith capillaries and were measured by
standard protocols using a Monolith NT.115, NanoTemper. The changes in the
fluorescent thermophoresis signal were plotted against the concentration of
the serially diluted MHV spike protein, and Kd values were determined using the
NanoTemper analysis software.
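
To make the titration analysis concrete, the sketch below fits a dissociation constant with the standard 1:1 binding model, in which the labelled partner is held at a fixed concentration (500 nM here) and the unlabelled partner is titrated. This is an illustration only: it is not the NanoTemper software's algorithm, it assumes the thermophoresis signal has already been rescaled to fraction bound, and the grid-search fit, function names and synthetic data are assumptions made for the example.

```python
import math


def fraction_bound(titrant_nM, labelled_nM, kd_nM):
    """Fraction of the labelled molecule in complex for a 1:1 interaction.

    Solves the quadratic mass-action equation for the complex concentration
    and divides by the total labelled concentration.
    """
    s = titrant_nM + labelled_nM + kd_nM
    complex_nM = (s - math.sqrt(s * s - 4.0 * titrant_nM * labelled_nM)) / 2.0
    return complex_nM / labelled_nM


def fit_kd(titrant_nM, observed_fraction, labelled_nM):
    """Least-squares grid search over log-spaced Kd values (0.1 nM to 100 µM)."""
    best_kd, best_sse = None, float("inf")
    for step in range(601):
        kd = 10.0 ** (-1.0 + 0.01 * step)
        sse = sum(
            (fraction_bound(t, labelled_nM, kd) - y) ** 2
            for t, y in zip(titrant_nM, observed_fraction)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic, noise-free titration (hypothetical values, not measured data).
labelled = 500.0                                   # nM, fixed labelled partner
titrant = [5000.0 / 2 ** i for i in range(16)]     # nM, twofold serial dilution
true_kd = 50.0                                     # nM, chosen for the example
signal = [fraction_bound(t, labelled, true_kd) for t in titrant]
estimated_kd = fit_kd(titrant, signal, labelled)
```

The quadratic form matters here because the labelled partner (500 nM) is not negligible relative to a nanomolar-range Kd, so the simpler hyperbolic approximation would overestimate the Kd.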

CryoEM sample preparation and data collection. Three microlitres of MHV
spike at 1.85 mg ml⁻¹ was applied to a 1.2/1.3 C-flat grid (Protochips), which had
been glow-discharged for 30 s at 20 mA. Thereafter, grids were plunge-frozen in
liquid ethane using a Gatan CP3 and a blotting time of 3.5 s. Data were acquired
using an FEI Titan Krios transmission electron microscope operated at 300 kV
and equipped with a Gatan K2 Summit direct detector. Coma-free alignment was
performed using the Leginon software. Automated data collection was carried out
using Leginon to control both the FEI Titan Krios (used in microprobe mode at a
nominal magnification of 22,500×) and the Gatan K2 Summit operated in counted
mode (pixel size: 1.315 Å) at a dose rate of ~9 counts per physical pixel per s, which
corresponds to ~12 electrons per physical pixel per s (when accounting for
coincidence loss). Each video had a total accumulated exposure of 53 e⁻ Å⁻² fractionated
in 38 frames of 200 ms (yielding movies of 7.6 s). A data set of ~1,600 micrographs
was acquired in a single session using a defocus range of between 2.0 and 5.0 µm.
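
The exposure figures quoted above are mutually consistent, as the following short Python check shows. It uses only numbers stated in the text (pixel size, corrected dose rate, frame count and frame duration); the variable names are illustrative.

```python
pixel_size_A = 1.315           # Å per physical pixel
dose_rate_e_per_px_s = 12.0    # electrons per physical pixel per second (corrected)
n_frames = 38
frame_time_s = 0.2             # 200 ms per frame

# Total movie length in seconds.
exposure_time_s = n_frames * frame_time_s

# Convert the per-pixel dose rate to a per-area dose rate (e⁻ per Å² per second).
dose_rate_e_per_A2_s = dose_rate_e_per_px_s / pixel_size_A ** 2

# Accumulated exposure over the whole movie and the share of each frame.
total_dose_e_per_A2 = dose_rate_e_per_A2_s * exposure_time_s
dose_per_frame_e_per_A2 = total_dose_e_per_A2 / n_frames
```

The product is about 52.7 e⁻ Å⁻² over 7.6 s, which rounds to the reported 53 e⁻ Å⁻², or roughly 1.4 e⁻ Å⁻² per frame.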
CryoEM data processing. Whole-frame alignment was carried out using the
software developed previously, which is integrated into the Appion pipeline, to
account for stage drift and beam-induced motion. The parameters of the
microscope contrast transfer function were estimated for each micrograph using ctffind3
(ref. 37). Micrographs were manually masked using Appion to exclude the visible
carbon supporting film from further processing. Particles were automatically picked
in a reference-free manner using DogPicker. Extraction of particle images was
performed using RELION 1.4 with a box size of 320 pixels and applying a windowing
operation in Fourier space to yield a final box size of 288 pixels (corresponding
to a pixel size of 1.46 Å). From the 1.2 million particles initially picked, a subset of
50,000 particles was randomly selected to generate class averages using RELION.
An initial 3D model was generated using OPTIMOD within the Appion pipeline.
The entire data set was subjected to 2D alignment and clustering using RELION
and particles belonging to the best-defined class averages were retained (~500,000
particles). These ~500,000 particles were then subjected to RELION 3D
classification with four classes (using c1 symmetry) starting with our initial model
low-pass filtered to 40 Å resolution. We subsequently used the ~230,000 best
particles (selected from the 3D classification) and the map corresponding to the
best 3D class (low-pass filtered at 40 Å resolution) to run RELION 3D auto-refine (c3
symmetry), which led to a reconstruction at 4.4 Å resolution. We used the particle
polishing procedure in RELION 1.4 to correct for individual particle movement
and radiation damage. A second round of 3D classification with 6 classes
(c3 symmetry) was performed using the polished particles, resulting in the selection
of 82,000 particles. A new 3D auto-refine run (c3 symmetry) using the selected
82,000 particles and the map corresponding to the best 3D class (low-pass filtered
at 40 Å resolution) yielded a map at 4.0 Å resolution following post-processing in
RELION. The final map was sharpened with an empirically determined B factor
of −220 Å² using RELION post-processing. Reported resolutions are based on the
gold-standard Fourier shell correlation (FSC) = 0.143 criterion, and Fourier shell
correlation curves were corrected for the effects of soft masking by high-resolution
noise substitution. The soft mask used for FSC calculation had a 10 pixel cosine
edge fall-off. The overall shape and dimensions of our reconstruction agree
with previous data although the HR2 stem connecting to the membrane is not
resolved.
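
Two numerical steps in this paragraph can be sketched in Python: the pixel size that results from cropping the box in Fourier space, and reading a resolution off an FSC curve at the 0.143 threshold. The sketch is a simplified illustration, not RELION's implementation: it interpolates linearly between shells, and the toy FSC curve is invented for the example.

```python
def cropped_pixel_size(pixel_size_A, box_in, box_out):
    """Pixel size after Fourier cropping a box from box_in to box_out pixels."""
    return pixel_size_A * box_in / box_out


def resolution_at_threshold(fsc, box_size, pixel_size_A, threshold=0.143):
    """Resolution (Å) at which an FSC curve first drops below `threshold`.

    fsc[k] is the correlation in Fourier shell k (k = 0 is the origin);
    shell k corresponds to spatial frequency k / (box_size * pixel_size_A).
    Returns None if the curve never crosses the threshold.
    """
    for k in range(1, len(fsc)):
        if fsc[k] < threshold <= fsc[k - 1]:
            # Linear interpolation of the fractional shell index at the crossing.
            frac = (fsc[k - 1] - threshold) / (fsc[k - 1] - fsc[k])
            shell = (k - 1) + frac
            if shell == 0:
                return None
            return box_size * pixel_size_A / shell
    return None


# Values from the text: 1.315 Å pixels, 320-pixel box cropped to 288 pixels.
final_pixel_size = cropped_pixel_size(1.315, 320, 288)

# Toy FSC curve (hypothetical, five shells only) to exercise the function.
toy_fsc = [1.0, 0.9, 0.5, 0.1, 0.0]
toy_resolution = resolution_at_threshold(toy_fsc, 288, 1.46)
```

The first function reproduces the 1.46 Å pixel size quoted for the 288-pixel box; the second shows why the reported resolution depends on both the box size and the pixel size, since these set the frequency of each Fourier shell.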

Model building and analysis. Fitting of atomic models into cryoEM maps was
performed using UCSF Chimera and Coot. We initially docked the MHV
domain A structure (PDB 3R4D) and used a crystal structure of a bovine coronavirus
domain A (PDB 4H14) to model the three-stranded β-sheet and the α-helix
present on the viral membrane proximal side of the galectin-like domain. Next, the
MERS-CoV domain B crystal structure (PDB 4KQZ) was also fit into the density,
and rebuilt and refined using RosettaCM. Although we could accurately align
the sequences corresponding to the core β-sheet of the MHV and MERS-CoV B
domains, the ~100 residues forming the β-motif extension (residues 453-535,
MERS-CoV/SARS-CoV receptor-binding moiety) could not be aligned with confidence.
We used RosettaCM to build models of each of the 945 possible disulfide
patterns into the density for domain B. For each disulfide arrangement, 50 models
were generated, and there was a very clear energy signal for a single such arrangement
(Extended Data Fig. 3k). Then, 1,000 models with this disulfide arrangement
were sampled, and the lowest energy model (using the Rosetta force field
augmented with a fit-to-density score term) was selected. Owing to the poor quality
of the reconstruction at the apex of the S trimer, the confidence of the model is
lowest for the segment corresponding to residues 453-535, as homology modelling
was used to fill in details missing in the map.
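
The figure of 945 disulfide patterns matches the number of ways to pair ten cysteines into five disulfide bonds (9 × 7 × 5 × 3 × 1). The text does not state the cysteine count, so ten is an assumption here, and the short Python sketch below is only an illustration of that counting argument, not the RosettaCM protocol itself.

```python
def count_pairings(n_cys):
    """Number of ways to pair n_cys cysteines into n_cys/2 disulfides."""
    if n_cys % 2:
        raise ValueError("an even number of cysteines is required")
    count = 1
    # Double factorial (n-1)!!: the first cysteine has n-1 possible partners,
    # the next unpaired one has n-3, and so on.
    for k in range(n_cys - 1, 0, -2):
        count *= k
    return count


def enumerate_pairings(items):
    """Yield every complete pairing of `items` as a list of 2-tuples."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in enumerate_pairings(remaining):
            yield [(first, partner)] + tail


# Assumed: ten cysteines, labelled 0..9 for illustration.
n_patterns = count_pairings(10)
n_models_screened = n_patterns * 50  # 50 models per arrangement, as in the text
```

Under that assumption the initial screen amounts to 47,250 models before the 1,000-model resampling of the best arrangement.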

A backbone model was then manually built for the rest of the S polypeptide
using Coot. Sequence register was assigned by visual inspection where side chain
density was clearly visible. This hand-built model was used as the initial
model for Rosetta de novo. The Rosetta-derived model largely agreed with the
hand-built model. Rosetta de novo successfully identified fragments that allowed us to
anchor the sequence register for domains C and D as well as for helices a;-035.
Given these anchoring positions, RosettaCM augmented with a novel density-guided
model-growing protocol was able to rebuild domains C and D in full. The
final model was refined by applying strict non-crystallographic symmetry constraints
using Rosetta. Model refinement was performed using a training map
corresponding to one of the two maps generated by the gold-standard refinement
procedure in RELION. The second map (testing map) was used only for calculation
of the FSC compared to the atomic model and to prevent overfitting. The quality
of the final model was analysed with Molprobity. Structure analysis was assisted
by the PISA and DALI servers. The sequence alignment was generated using
MultAlin and coloured with ESPript. All figures were generated with UCSF
Chimera.
Extended Data Figure 1 | Biophysical characterization of the MHV
S ectodomain. a, The MHV S molecular mass was determined to be
463.2 ± 0.3 kDa (mean ± s.e.m.) (corresponding to a trimer) using
size-exclusion chromatography coupled in-line with multi-angle light
scattering and refractometry. The blue line represents the normalized
refractive index (right ordinate axis) and the red line shows the estimated
molecular mass (expressed in Da, left ordinate axis). b, MHV S binds with
high affinity to the soluble mouse CEACAM1a receptor. Thermophoresis
signal plotted against the MHV S concentration (nM). The dissociation constant
(Kd) was determined to be 48.5 ± 3.8 nM. Values correspond to the average
of two independent experiments. The concentration of CEACAM1a used
Extended Data Figure 2 | CryoEM analysis of the MHV S trimer.
a, b, Representative electron micrograph (defocus: 4.6 μm) (a) and class
averages (b) of the MHV S trimer embedded in vitreous ice. Scale bars:
573 Å (micrograph) and 44 Å (class averages). c, Gold-standard (blue) and
model/map (red) Fourier shell correlation (FSC) curves. The resolution
was determined to 4.0 Å. The 0.143 and 0.5 cut-off values are indicated by
horizontal grey bars.
Extended Data Figure 3 | CryoEM density for selected regions of the
MHV S reconstruction, local resolution analysis and density-guided
homology modelling of residues 453-535. The atomic model is shown
with the corresponding region of the map. a, b, Upstream helix.
c-e, Helix belonging to domain A (residues 284-296). f-h, Core β-sheet.
i, j, CryoEM density corresponding to the MHV S trimer (i) and a single
protomer (j), coloured according to local resolution determined with the
software Resmap. We interpret Resmap results as a qualitative (rather than
quantitative) estimate of map quality. k, Rebuilding of the MHV S domain
B using RosettaCM. Plot showing the energy mean and s.d. of the models
corresponding to the 30 lowest energy disulfide arrangements (out of 945)
for domain B.
Data collection
  Number of particles                 82,000
  Pixel size (Å)                      1.315 (rescaled to 1.46)
  Defocus range (μm)                  2-5
  Voltage (kV)                        300
  Electron dose (e⁻/Å²)               53

Refinement
  Resolution (Å)                      4.0
  Map sharpening B factor (Å²)        −220

Model validation
  Molprobity score (percentile)       1.55 (94th)
  All-atom clashscore (percentile)    3.68 (97th)
  Poor rotamers (%)                   0
  Ramachandran favored (%)            94.26
  Ramachandran allowed (%)            99.44
  Ramachandran outliers (%)           0.56
  r.m.s.d. bonds (Å)                  0.018
  r.m.s.d. angles (°)                 1.8

Extended Data Figure 4 | Refinement and model statistics.
Extended Data Figure 5 | Structural organization of the S1 subunit.
a, Ribbon diagram showing a single S1 protomer. b, Close-up view of the
MHV S domain B. The structural motif used as a receptor-interacting
moiety by MERS-CoV and SARS-CoV is indicated. The density was too
weak to allow tracing of this segment (residues 453-535), which has been
traced by density-guided homology modelling using Rosetta. c, d, Ribbon
diagrams of the S1 trimer viewed from the side (c) and from the top
(looking towards the viral membrane) (d).
Extended Data Figure 6 | Mechanisms of membrane fusion promoted
by coronavirus S glycoproteins. a, Ribbon diagram of the MHV S2 pre-fusion
structure. Disulfide bonds are shown as green sticks. b, Topology
diagram of the MHV S2 pre-fusion structure. PP, di-proline that will act as
a helix breaker. The presence of these di-proline motifs indicates that the
post-fusion HR1 coiled-coil could not extend up to the fusion peptide as
a single helix. This hypothesis is further supported by the observation of
a conserved disulfide bond formed between residues Cys894 and Cys905
(labelled 14 in a and b), which will prevent refolding of helices a2 and
3 as a single extended helix. c, Ribbon diagram of the SARS-CoV post-fusion
HR1 helix obtained by X-ray crystallography (PDB 1WYY). The
residue numbers corresponding to the MHV A59 sequence are indicated.
d, Topology diagram showing the expected coronavirus S post-fusion
conformation derived from our MHV S structure and the SARS-CoV
post-fusion core crystal structure shown in c. e, Ribbon diagram of a
model of the MHV S2 post-fusion conformation. Residues belonging
to 21, 22, 023, Bag, O24 and Qy5 are not represented owing to a lack of
structural information.
Extended Data Figure 7 | Structural organization of the S2 fusion machinery. a, Ribbon diagram of the trimer of central helices. b, c, Ribbon diagrams
of the S2 trimer (starting at residue 755) viewed from the side (b) and from the bottom (looking towards the host cell membrane) (c). Residues Ala994
and Leu1062, which are discussed in the text, are shown in stick format.
Extended Data Figure 8 | Class I viral fusion proteins with exposed fusion peptide. a, MHV S (residues 870-887). b, Parainfluenza virus 5 F (PIV5 F,
residues 103-128, PDB 2B9B). c, HIV-1 gp41 (residues 518-528, PDB 4TVP). The trimeric fusion proteins are shown as grey ribbon diagrams with the
fusion peptides rendered in magenta.
Extended Data Figure 9 | Sequence conservation among coronavirus S
glycoproteins. a, Sequence alignment of coronavirus S proteins. Bovine-CoV,
bovine respiratory coronavirus AH187 (gi 253756585); HKU1,
human coronavirus HKU1 (gi 545299280); HKU4, tylonycteris bat
coronavirus HKU4 (gi 126030114); HKU5, pipistrellus bat coronavirus
HKU5 (gi 126030124); MERS-CoV, Middle East respiratory syndrome
coronavirus (gi 836600681); MHV-A59, mouse hepatitis virus A59
(gi 1352862); MHV-JHM, mouse hepatitis virus JHM (gi 60115395);
MHV-2, mouse hepatitis virus 2 (gi 5565844); OC43, human coronavirus
OC43 (gi 744516696); SARS-CoV, severe acute respiratory syndrome
coronavirus ZJ01 (gi 39980889); Waterbuck-CoV, waterbuck coronavirus
US/OH-WD358-TC/1994 (gi 215478096). Asparagine residues featuring
N-linked glycan chains visible in the MHV S reconstruction are indicated
with a star. The S2 and S2′ cleavage sites are indicated with scissors at
positions corresponding to the MHV S sequence. Cysteine residues
involved in the formation of disulfide bonds are numbered according to
Supplementary Table 2. The secondary structure elements observed in our
MHV S reconstruction are indicated above the sequence. The black dotted
lines above the sequence indicate regions poorly defined in the density.
Although the viral membrane distal loops of the A domains are weakly
defined in the density, the availability of a crystal structure of this domain
from the same virus (PDB 3R4D) helped with the modelling.
Extended Data Figure 10 | Structural similarity of B domains among coronavirus S glycoproteins. a, MHV (pink). b, MERS-CoV (orange, PDB
4KQZ). c, SARS-CoV (red, PDB 2AJF). d, HKU4 (blue, PDB 4QZV).
HKU1 is a human betacoronavirus that causes mild yet prevalent
respiratory disease, and is related to the zoonotic SARS and
MERS betacoronaviruses, which have high fatality rates and
pandemic potential. Cell tropism and host range are determined
in part by the coronavirus spike (S) protein, which binds cellular
receptors and mediates membrane fusion. As the largest known
class I fusion protein, its size and extensive glycosylation have
hindered structural studies of the full ectodomain, thus preventing
a molecular understanding of its function and limiting development
of effective interventions. Here we present the 4.0 Å resolution
structure of the trimeric HKU1 S protein determined using single-particle
cryo-electron microscopy. In the pre-fusion conformation,
the receptor-binding subunits, S1, rest above the fusion-mediating
subunits, S2, preventing their conformational rearrangement.
Surprisingly, the S1 C-terminal domains are interdigitated and form
extensive quaternary interactions that occlude surfaces known in
other coronaviruses to bind protein receptors. These features, along
with the location of the two protease sites known to be important for
coronavirus entry, provide a structural basis to support a model of
membrane fusion mediated by progressive S protein destabilization
through receptor binding and proteolytic cleavage. These studies
should also serve as a foundation for the structure-based design of
betacoronavirus vaccine immunogens.

Betacoronavirus S proteins are processed into S1 and S2 subunits
by host proteases. Like other class I viral fusion proteins, the two
subunits trimerize and fold into a metastable pre-fusion conformation.
The S1 subunit is responsible for receptor binding, while the S2
subunit mediates membrane fusion. Coronaviruses typically possess
two domains within S1 capable of binding to host receptors: an amino
(N)-terminal domain (NTD) and a carboxy (C)-terminal domain
(CTD), with the latter recognizing protein receptors for SARS-CoV
and MERS-CoV. Although these individual domains have been
structurally characterized, the organization of the complete spike has
not yet been determined, preventing a mechanistic understanding of
S protein function.

Here, we present the structure of the HKU1 S protein ectodomain
determined using cryo-electron microscopy (cryo-EM) to 4.0 Å resolution
(Fig. 1a and Extended Data Figs 1 and 2 and Extended Data
Table 1). The protein construct contains a C-terminal T4 fibritin trimerization
motif and a mutated S1/S2 furin-cleavage site (Extended
Data Fig. 3). The S1 subunit adopts an extended conformation with
short linkers between domains and sub-domains (Fig. 1b). The S1 NTD
(amino acids 14-297) has strong structural and sequence homology to
the bovine coronavirus (BCoV) S1 NTD (Extended Data Fig. 4), which
recognizes acetylated sialic acids on glycosylated cell-surface receptors.
The glycan-binding site in the BCoV S1 NTD is conserved in the HKU1
S1 NTD and is located at the apex of the trimer, oriented towards target
cells. Indeed, HKU1 S1 was recently shown to bind O-acetylated sialic
acids on host cells, and these glycans were required for efficient infection
of primary human airway epithelial cultures.

The HKU1 S1 CTD (amino acids 325-605) consists of a structurally
conserved core connected to a large, variable loop (HKU1 S amino
acids 428-587) that is partially disordered (Extended Data Figs 5
and 6). The CTD is located at the trimer apex close to the threefold
axis, and the core interacts with the other two S1 CTD cores and with
one NTD from an adjacent protomer. The domain swapping between
protomers results in a woven appearance when viewed looking down
towards the viral membrane (Fig. 2a). Structural alignment of the
SARS-CoV and MERS-CoV CTD-receptor complexes with the
HKU1 pre-fusion S protein reveals that the protein-receptor-binding
surface of the S1 CTD is buried in the HKU1 S protein trimer and is
therefore incapable of making equivalent interactions without some
initial breathing and transient exposure of these domains (Fig. 2b).


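[Figure 1 image labels: scale bar, 150 Å; viral membrane; mutated furin site. Panel c domain bar: NTD, CTD, SD-1, SD-2, HR2, Fd; cleavage sites S1/S2 and S2′; residue markers 14, 757, 900, 1276; subunits S1 and S2.]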

Figure 1 | Structure of the HKU1 pre-fusion spike ectodomain.
a, A single protomer of the trimeric S protein is shown in cartoon
representation coloured as a rainbow from the N to C terminus (blue to
red) with the reconstructed EM density of remaining protomers shown
in white and grey. b, The S1 subunit is composed of the NTD and CTD
as well as two sub-domains (SD-1 and SD-2). The S2 subunit contains
the coronavirus fusion machinery and is primarily α-helical. c, Domain
architecture of the HKU1 S protein coloured as in a.
Figure 2 | Architecture of the HKU1 S1 subunit. a, EM density
corresponding to each S1 protomer is shown. The putative glycan-binding
and protein-receptor-binding sites are indicated with dashed shapes on
the NTD and CTD, respectively. b, The HKU1 S1 CTD forms quaternary
interactions with an adjacent CTD using a surface similar to that used
by SARS CTD to bind its receptor, ACE2 (ref. 11). c, Sub-domain 1 is
composed of amino acid residues before and after the S1 CTD. d, Sub-domain
2 is composed of S1 sequence C-terminal to the CTD, a short
peptide following the NTD, and the N-terminal strand of S2, which follows
the S1/S2 furin-cleavage site.


Although a protein receptor has not yet been identified for HKU1,
antibodies against the CTD, but not those against the NTD, blocked
HKU1 infection of cells. These data suggest that the S1 CTD is the
primary HKU1 receptor-binding site, whereas the NTD mediates
initial attachment via glycan binding.

HKU1 S1 also contains two sub-domains (which we term SD-1 and
SD-2) that lack significant homology to previously determined structures
(Fig. 2c, d). These sub-domains are primarily composed of S1
amino acid sequences following the CTD. However, stretches of amino
acids preceding the CTD as well as S2 residues adjacent to the S1/S2
cleavage site also contribute to the sub-domains. This complex folding
of elements dispersed throughout the primary sequence may allow
receptor-induced conformational changes in the CTD to be transmitted
to other parts of the structure.

In contrast to other viral fusion proteins such as influenza haemagglutinin
(HA) or HIV-1 envelope (Env), the HKU1 S1 subunits are
rotated about the trimeric threefold axis with respect to the S2 subunits,
causing the S1 subunit from one protomer to sit above the S2 subunit
of an adjacent protomer (Extended Data Fig. 7). Similar to HA and
Env, a region in the HKU1 S1 CTD (amino acids 371-380) caps the S2
central helix, thereby preventing the fusion machinery from springing
into action.

Processing of coronavirus S proteins by host proteases plays a critical
role in the entry process. HKU1 S is cleaved by furin into S1 and S2
subunits during protein biosynthesis. Though mutated in the protein
construct used here and disordered in the density map, the HKU1 S
furin-cleavage site at the S1/S2 junction lies in a loop of SD-2 (Fig. 3
and Extended Data Fig. 6). Furin cleavage would leave a single S2
β-strand participating in the SD-2 β-sheets (Fig. 2d). Coronavirus S
proteins also have a secondary cleavage site, termed S2′ (Arg900),
Figure 3 | HKU1 S2 subunit fusion machinery. a, The HKU1 S2 subunit
is coloured like a rainbow from the N-terminal β-strand (blue), which
participates in S1 sub-domain 2, to the C terminus (red) before HR2.
b, The HKU1 S2 structure contains the fusion peptide (FP) and a heptad
repeat (HR1). Protease-recognition sites are indicated within disordered
regions of the protein (dashed lines). c, A comparison of coronavirus S2
HR1 in the pre- and post-fusion conformations. Five HR1 α-helices are
labelled and coloured like a rainbow from blue to red, N to C terminus,
respectively. The structures are oriented to position similar portions of the
central helix (red).


adjacent to the viral fusion peptide (amino acids 901-918) (Fig. 3b
and Extended Data Fig. 6). This is similar to the multiple endoproteolytic
cleavage events that occur in the fusion proteins of respiratory
syncytial virus (RSV) and Ebola virus. Protease cleavage at S2′ likely
follows S1/S2 cleavage and may not occur until host-receptor engagement
at the plasma membrane or viral endocytosis.

As in all class I viral fusion proteins, the coronavirus S2 subunit contains
the four elements required for membrane fusion: a fusion peptide
or loop, two heptad repeats (HR1 and HR2), and a transmembrane
domain. Refolding of HR1 into a long α-helix thrusts the fusion
peptide into the host-cell membrane, and as the two heptad repeats
interact to form a coiled-coil, the host and viral membranes are brought
together. The fusion peptide, conserved among coronavirus S proteins
(Extended Data Fig. 6), is located on the exterior of the HKU1 S protein
and is adjacent to the putative S2′ cleavage site, which remains
uncleaved in our structure. The fusion peptide forms a short helix and
a loop, with most of the hydrophobic amino acids buried in an interface
with other elements of S2. Unlike influenza HA, where the C terminus of
the fusion peptide is only 14 amino acids away from the N terminus of
HR1, the fusion peptide of HKU1 S is 60 amino acids away from HR1.
This span of protein contains four short α-helices and several longer
regions lacking regular secondary structure. This intervening sequence
is also buried beneath SD-2 and the S2′ cleavage site, suggesting that
cleavage may affect the proclivity of S2 for undergoing the transition
to the post-fusion conformation.

Coronavirus S protein heptad repeats are unusually large, with HR1
encompassing more than 90 amino acids. In the cryo-EM structure,
HR2 is located at the base of the HKU1 S protein near the viral membrane,
but is poorly ordered, precluding unambiguous assignment of
the residues. However, HR1 is well ordered and arranged along the
length of the S2 subunit, forming four short helices and part of the
Figure 4 | Comparison of structurally related class I viral fusion proteins. The fusion proteins from coronaviruses, influenza virus and HIV-1 are
cleaved into receptor-binding subunits (pink, light green, light blue) and the viral fusion machinery (dark red, dark green, blue). Comparison to
other class I fusion proteins can be found in Extended Data Fig. 8.


central three-helix bundle. This arrangement of HR1 is similar to that
of influenza HA, although in HA the HR1 is organized as two helices
connected by a long loop. Conversion of influenza HA to the post-fusion
conformation requires these protein elements to transition into
a single long α-helix. The post-fusion six-helix bundle structures of
SARS-CoV and MERS-CoV S2 heptad repeats reveal that coronavirus
S proteins also undergo a similar transition (Fig. 3c). However,
the S protein must carry out five such loop-to-helix transitions, highlighting
the complexity of S proteins relative to other class I fusion
proteins. In addition, the membrane distal regions of the pre-fusion S2
central three-helix bundle (S2 amino acids 1070-1076), which is the
C-terminal portion of HR1, are splayed outwards from the threefold
axis (Extended Data Fig. 7). In the available coronavirus post-fusion
HR1-HR2 structures, this portion of HR1 forms a tight three-helix
bundle. Formation of this three-helix bundle may be prevented
by interactions between the C-terminal end of the S2 HR1 and the
S1 CTD, and thus disruption of these interactions through receptor-induced
conformational changes would provide an additional means
by which receptor binding in S1 can initiate S2-mediated membrane
fusion. Indeed, protease cleavage and an acidic pH are thought to be
insufficient to trigger the transition to the post-fusion conformation
without additional destabilization provided by receptor binding.

The formation of anti-parallel six-helix bundles composed of HR1
and HR2 in the post-fusion conformation is a unifying feature of class I
viral fusion proteins. However, the pre-fusion conformations of this
protein family are incredibly diverse in size and topology (Extended
Data Fig. 8). The HKU1 S protein structure presented here most closely
resembles influenza virus HA and HIV-1 Env (Fig. 4), which also have
receptor-binding subunits that cap the central helix of the fusion subunit.
However, some core elements of the fusion machinery are
conserved amongst all class I fusion proteins, including paramyxovirus
F proteins.

The HCoV-HKU1 S protein trimer in a pre-fusion conformation is,
to our knowledge, the largest class I viral fusion glycoprotein structure
determined to date (Fig. 4 and Extended Data Figs 8 and 9). Since
betacoronavirus S proteins are similar in size and have a conserved
domain organization, our findings should be generally applicable
to other betacoronaviruses, including SARS-CoV and MERS-CoV
(Extended Data Fig. 6). Our studies provide a structural basis for S protein
function wherein the pre-fusion S protein is progressively matured
and destabilized by receptor binding and protease cleavage. Following
dissociation of the S1 subunits, HR1 would transition to a long α-helix,
and the fusion peptide would be released from the side of the S2
subunit and inserted into host membranes. The structure and mechanistic
insights presented here should enable engineering of pre-fusion
stabilized coronavirus S proteins as vaccine immunogens against current
and emerging betacoronaviruses, similar to recent efforts for other
viral fusion proteins. This work also acts as a springboard for future
studies to define mechanisms of antibody recognition and neutralization,
which will lead to an improved understanding of coronavirus
immunity.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Data reporting. No statistical methods were used to predetermine sample size. 
The investigators were not blinded to allocation during experiments and outcome 
assessment. 

Protein expression and purification. A mammalian-codon-optimized gene
encoding HKU1 S (isolate N5, NCBI accession Q0ZME7) residues 1-1276 with
a C-terminal T4 fibritin trimerization domain, an HRV3C cleavage site, and a
6xHis-tag was synthesized and subcloned into the eukaryotic expression vector
pVRC8400. The S1/S2 furin-recognition site 752-RRKRR-756 was mutated
to GGSGS to generate the uncleaved construct used for cryoEM studies. Three
hours after this plasmid was transfected into FreeStyle 293-F cells (Invitrogen),
kifunensine was added to a final concentration of 5 μM. FreeStyle 293-F cells are a
high-transfection-efficiency cell line adapted for suspension culture, derived from
low passage clonal cultures, and after purchase were not further authenticated.
Cells were not confirmed to be free of mycoplasma, but were only used for protein
expression. Cultures were harvested after six days, and protein was purified
from the medium using Ni-NTA Superflow resin (Qiagen). The buffer was then
exchanged using a HiPrep 26/10 desalting column (GE Healthcare Biosciences)
from a high-imidazole elution buffer to a low pH buffer (20 mM Bis-Tris pH 6.5,
150 mM NaCl). Afterward, endoglycosidase H (EndoH) (10% w/w) and HRV3C
protease (1% w/w) were added to the protein and the reaction was incubated overnight
at 4 °C. The digested protein was further purified using a Superose 6 16/70
column (GE Healthcare Biosciences).

The furin-cleaved HKU1 S construct analysed by negative-stain EM was similar
to the one described above except that it encoded residues 1-1249 and contained 
the wild-type RRKRR furin-recognition site. Expression and purification were also 
similar, except that a plasmid expressing furin was co-transfected into the FreeStyle 
293-F cells to ensure complete processing of the protein. 

Sample preparation for negative-stain electron microscopy. HKU1 S proteins
were placed directly onto 400 copper mesh grids and then stained with 1% uranyl
formate. Tris-buffered saline (TBS) was used as buffer if dilution was necessary.
Negative-stain electron microscopy data collection. Grids were loaded into a
Tecnai T12 Spirit operating at 120 keV and imaged using a Tietz TemCam-F416
CMOS at 52,000× magnification at ~1.5 μm under focus. Micrographs were
collected using Leginon and processed within Appion. Particles were picked
using a difference-of-Gaussians approach and aligned using reference-free 2D
classification employing iterative multivariate statistical analysis/multi-reference
alignment (MRA/MSA) using a binning factor of 2 to remove amorphous particles.
Particles in classes that did not represent views of HKU1 S proteins were
discarded. ISAC was used to generate a template stack from which initial 3D
models were generated using the EMAN2 (ref. 36) procedure initialmodel.py. 3D
models were refined using EMAN1 (ref. 37).

Sample preparation for cryo-electron microscopy. Sample solution (3 μl) was
applied to the carbon face of a CF-2/2-4C C-Flat grid (Electron Microscopy
Sciences, Protochips) that had been plasma cleaned for five seconds using a mixture
of Ar/O2 (Gatan Solarus 950 Plasma system). The grid was then manually
blotted and immediately plunged into liquid ethane using a manual freeze plunger.
Cryo-electron microscopy data collection. Movies were collected via the
Leginon interface on a FEI Titan Krios operating at 300 keV mounted with a
Gatan K2 direct-electron detector. Each movie was collected in counting mode
at 22,500× nominal magnification, resulting in a calibrated pixel size of 1.31 Å/pix
at the object level. A dose rate of ~10 e⁻/((cam pix) × s) was used; exposure time
was 200 ms per frame. The data collection resulted in a total of 1,049 movies containing
50 frames each. Total dose per movie was 57 e⁻/Å². Data were collected at
1.0 to 3.5 μm under focus.
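
The acquisition parameters quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only (the function name is an assumption); it converts a per-pixel dose rate into a total dose per Å² and, with the nominal values, gives roughly 58 e⁻/Å², in line with the reported 57 e⁻/Å² given that the dose rate is stated only approximately.

```python
def total_dose_per_A2(dose_rate_e_per_pix_s, frame_time_s, n_frames, pixel_size_A):
    """Total electron dose per square ångström accumulated over one movie."""
    exposure_s = frame_time_s * n_frames                # total exposure time
    electrons_per_pixel = dose_rate_e_per_pix_s * exposure_s
    return electrons_per_pixel / pixel_size_A ** 2      # pixel area in Å²


# Nominal values from the text: ~10 e-/(pix s), 200 ms frames, 50 frames, 1.31 Å/pix.
exposure_time_s = 0.2 * 50
dose = total_dose_per_A2(10.0, 0.2, 50, 1.31)
```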

Cryo-electron microscopy data processing. Frames in each movie were aligned,
and CTF estimation was carried out using CTFFIND3 (ref. 39). Particles were
picked from a subset of the data employing a difference-of-Gaussians approach
and aligned using reference-free 2D classification employing iterative MRA/MSA
using a binning factor of two. The resulting 2,188 particles were used to generate
an initial 25 Å lowpass-filtered 3D reconstruction using EMAN2. SPIDER refproj.spi
with a delta theta angle of 15 degrees was used to generate 83 projection
images of the initial 3D reconstruction. These projection images were used as
templates for picking particles from the entire cryo data set. Particles from the
entire data set were aligned and classified with the same methods used for the
subset of particles stated above. After 2D classification, unbinned selected particles
were symmetrically refined in RELION version 1.3 (refs 41, 42) against the initial
3D reconstruction filtered to 60 Å resolution. This refinement was followed by
particle polishing and refinement of the resulting realigned, B-factor-weighted
and signal-integrated particles using RELION version 1.4b1. The resolution of the
final map was 4.04 Å at an FSC cutoff of 0.143. A mask was generated in RELION
using a threshold that accounted for the entire structure. From this threshold, the
mask was further dilated by 3 voxels and a Gaussian fall-off was generated over an
additional 6 voxels. The mask effect on FSC was taken into consideration. Phases
were randomized in the unfiltered half-set maps for initial FSC lower than 0.8
and a new FSC between these phase-randomized maps was generated and used to
correct for mask effects in the final FSC-based resolution estimate. The reported
resolution of 4.04 Å is the RELION CorrelationCorrected value.
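
The mask correction described above is commonly written as FSC_true = (FSC_masked − FSC_rand) / (1 − FSC_rand), applied to shells beyond the resolution at which phases were randomized. The following Python sketch assumes that standard high-resolution noise-substitution formula; the function name, the `start_shell` parameter and the toy values are illustrative and are not taken from RELION's code.

```python
def corrected_fsc(fsc_masked, fsc_randomized, start_shell):
    """Apply the phase-randomization correction shell by shell.

    fsc_masked: FSC between the masked half-maps.
    fsc_randomized: FSC between the masked, phase-randomized half-maps.
    start_shell: first shell index at which phases were randomized;
                 earlier shells are returned unchanged.
    """
    corrected = []
    for i, (t, n) in enumerate(zip(fsc_masked, fsc_randomized)):
        if i < start_shell or n >= 1.0:
            corrected.append(t)  # nothing to correct (or correction undefined)
        else:
            corrected.append((t - n) / (1.0 - n))
    return corrected


# Toy values (hypothetical, not data from this study).
toy_corrected = corrected_fsc([0.9, 0.5, 0.3], [0.9, 0.2, 0.1], start_shell=1)
```

Any correlation that survives phase randomization can only come from the mask, which is why it is subtracted before the 0.143 threshold is applied.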

The map was B-factor sharpened employing FSC-weighting. The B-factor
was estimated in RELION based on the resolution range from 10 Å to 2.62 Å
(B-factor = −117 Å²). The detector MTF file was provided to RELION.
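
To give a sense of what a sharpening B-factor of this size does, the sketch below evaluates the usual amplitude scaling exp(−B / (4d²)) at resolution d. It assumes that conventional form and ignores the FSC-weighting and MTF correction that were also applied, so it is an illustration rather than a reproduction of the post-processing.

```python
import math


def sharpening_factor(b_factor_A2, resolution_A):
    """Amplitude multiplier exp(-B * s^2 / 4) with s = 1 / resolution."""
    s = 1.0 / resolution_A
    return math.exp(-b_factor_A2 * s * s / 4.0)


# A negative B-factor boosts high-resolution shells more than low-resolution ones.
boost_at_10A = sharpening_factor(-117.0, 10.0)
boost_at_4A = sharpening_factor(-117.0, 4.04)
```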
Model building and refinement. An initial model of the S1 NTD was generated
using the Modeller homology modelling tool in UCSF Chimera with the BCoV
NTD (PDB 4H14) as a template. The NTD homology model was docked into the
HKU1 S protein EM density and refined with Rosetta density-guided iterative local
refinement while imposing C3 symmetry. Rosetta output models were clustered
based on pairwise r.m.s.d. using a cluster radius of 2.15 Å. The lowest energy model
from the largest cluster was selected for additional refinement. This model and the
conserved CTD core from SARS-CoV (PDB 2AJF) were used as starting structures
for model building and refinement. These starting models and the remaining
HKU1 protein sequence were modelled manually using COOT and refined using
RosettaRelax. Structures were evaluated using EMRinger and Molprobity.
Figures were produced in the PyMol or UCSF Chimera software packages.
Extended Data Figure 1 | Data processing flowchart. a, Processing
resulting in density map of pre-fusion HKU1 spike glycoprotein at 4.04 Å
resolution. b, FSC plot illustrating correlation between two volumes
refined independently from two distinct half sets of raw data. A final
resolution of 4.04 Å is indicated in the plot. c, Angular distribution of
raw data within the data set. A slight, but within normal range,
over-representation of top views was observed (tall red bars).
Extended Data Figure 2 | Resolution of the pre-fusion HKU1 S density
map. a, Local resolution within the EM density map. Local resolution
was calculated using ResMap, discretizing every 0.25 Å over a range
from 2 × voxel size (2.62 Å) to 4 × voxel size (5.24 Å). Resolution
significance criterion was set to 0.05. The resolution ranges from 3.74 Å
in stable internal secondary structures to greater than 5.00 Å in flexible
peripheral loops. b, Close-ups of secondary-structure densities. To the
left is displayed the central α-helix of an S2 monomer and to the right is a
β-sheet from the NTD domain in an S1 monomer.
Extended Data Figure 3 | Cleavage at the S1/S2 junction does not
induce large conformational changes in HKU1 spike. a, HKU1 spike
1-1249 with an attached foldon domain and wild-type furin-cleavage site
was reconstructed using negative-stain electron microscopy. b, HKU1
spike 1-1276 with an attached foldon and a mutated furin-cleavage site
reconstructed using negative-stain electron microscopy. c, HKU1 spike
1-1249 without foldon and with mutated furin-cleavage site. Side and top
views are shown.
Panel annotations (a, b, c): wild-type S1/S2 cleavage site (+, −, −);
foldon trimerization domain (+, +, −).
Extended Data Figure 4 | Putative glycan binding site of the HKU1
S1 NTD. a, HKU1 trimeric S and b, an isolated monomer. Putative host
glycan-binding and protein-receptor-binding sites are indicated. c, The
bovine coronavirus (BCoV) S1 NTD structure from Peng et al. (teal)
is superposed onto the HKU1 S NTD (pink). Residue side-chains
involved in the putative glycan-binding site (dashed circle) are shown
as sticks, with oxygen atoms coloured red and nitrogen atoms coloured
blue. Note that N198 (BCoV) and N188 (HKU1) are predicted N-linked
glycosylation sites.
Extended Data Figure 5 | Betacoronavirus S proteins possess a
conserved structural core in their C-terminal domains. a, The
structurally divergent loop of the S1 CTD is poorly ordered distal to the
core CTD domain. The conserved S1 CTD cores of b, HKU1-CoV
highlighted in the trimeric pre-fusion S, c, HKU1-CoV as an isolated
domain, d, MERS-CoV and e, SARS-CoV are coloured according
to secondary structure (β-sheets: pink, α-helices: blue, lacking regular
secondary structure: grey) and the insert which differs amongst
coronaviruses is coloured yellow. Atoms participating in quaternary
interactions with other HKU1 S protomer CTDs are shown in green
surface in c. f, The positions of these interacting atoms are mapped on to
the conserved core topology. The sheet and helix nomenclature is taken
from reference 10.
Extended Data Figure 6 | Sequence alignment of human
betacoronavirus S proteins. Sequence alignment of S proteins from
HKU1, SARS-CoV and MERS-CoV using Clustal Omega. Protein
features described in the text are indicated: N-terminal domain (NTD),
C-terminal domain (CTD) which contains the large variable loop, the
S1/S2 and S2′ cleavage sites, fusion peptide (FP), heptad repeats 1 and 2
(HR1, HR2) and transmembrane helix (TM).
Extended Data Figure 7 | S1 sits atop an adjacent protomer's S2. a, The
HKU1 S1 subunits are rotated about the trimeric threefold axis relative to
their corresponding S2 subunits such that the S1 CTD from one protomer
caps the S2 central helix from an adjacent protomer (CTD, blue, caps
S2, red). The third protomer of the trimer has been omitted for clarity.
b, HKU1 S1 CTD (blue) uses a short helix to cap the central helix and
HR1 (red). c, The influenza haemagglutinin HA2 central helix (red) is
also capped by a helix in HA1 (blue). d, The S2 N-terminal β-strand
is connected to the remainder of the S2 subunit via a loop and an α-helix
(dotted lines). These regions of the EM density are of insufficient quality
to confidently build this protein region but enable interpretation of
connectivity. e, In the pre-fusion HKU1 S protein, the tops of the central
S2 helices (blue, red, green) are splayed outwards from the threefold axis
and capped by the S1 CTDs (white). The S1 NTD, SD-1 and SD-2 have
been omitted for clarity. f, In the post-fusion six-helix-bundle structure of
SARS S, the corresponding helical regions from (e) form a well-packed
three-helix bundle.
Extended Data Figure 8 | Class I viral fusion proteins. All class I fusion
proteins require proteolytic cleavage adjacent to the fusion peptide or loop,
and the metastable pre-fusion state is triggered by a series of events that
involve pH change or receptor binding. The post-fusion conformations
all contain anti-parallel six-helix bundles composed of the HR1 and
HR2 from the membrane-proximal subunit. However, there is a great
diversity in pre-fusion conformations as shown here. Members of this
class that also participate in receptor binding (top row), including
S glycoproteins of coronaviruses, are organized such that their receptor
binding subunits sit atop the fusion machinery, and need to be shed in
order for membrane fusion to proceed. Paramyxovirus F proteins
(bottom row) have a different architecture than the capped fusion proteins
on the top row. The F proteins all have disulfide bonds between the
membrane proximal and membrane distal subunits, and the two subunits
remain interconnected throughout the rearrangement process.
Panel labels: HIV-1 Env; Ebolavirus GP.
Extended Data Figure 9 | HKU1 S glycosylation. a, Sites of N-linked
glycosylation on the HKU1 S trimer and b, a single monomer. Of the
30 potential N-linked glycosylation sites in a single protomer, the
asparagine residues are observed for 21 sites and of these a small portion
of density in the EM map is observed for 10 sites corresponding to the
EndoH-trimmed sugars. Asparagines where glycan density is observed are
shown as magenta spheres. Asparagines lacking glycan density are shown
in green.
Extended Data Table 1 | CryoEM data collection, processing and refinement metrics

Data collection/processing
  Microscope                        Titan Krios
  Voltage (keV)                     300
  Defocus range (µm)                IMtosS
  Movies                            1,049
  Frames per movie                  50
  Exposure time per frame (ms)      200
  Magnification                     22,500×
  Dose rate (e−/pixel/s)            10
  Total dose per movie (e−/Å²)      57
  Particles                         31,435
  Map resolution (Å)                4.04

Model refinement
  Chimera CC                        0.87
  EMRinger score                    27
  MolProbity                        1.6
  Clashscore                        3.0
  Ramachandran (%)
    Favored                         92.1
    Allowed                         7.0
    Outliers                        0.9

CC = cross correlation
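
The dose entries in the table can be cross-checked with a short calculation. The sketch below is illustrative only: the pixel size of 1.31 Å is an assumption inferred from the "2 × voxel size (2.62 Å)" quoted in the Extended Data Fig. 2 legend, and the function name is hypothetical.

```python
def total_dose_per_A2(dose_rate_e_per_pixel_s, frames, frame_time_s, pixel_size_A):
    """Accumulated dose per movie in electrons per square ångström."""
    exposure_s = frames * frame_time_s          # total exposure per movie
    electrons_per_pixel = dose_rate_e_per_pixel_s * exposure_s
    return electrons_per_pixel / pixel_size_A**2


# Table values: 10 e-/pixel/s, 50 frames, 200 ms per frame.
# Pixel size 1.31 Å is assumed, not taken from the table.
dose = total_dose_per_A2(10.0, 50, 0.200, 1.31)
print(f"{dose:.1f} e-/Å^2")
```

Under this assumed pixel size the result is close to the tabulated 57 e−/Å²; the small difference would be consistent with rounding of the pixel size.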
Crystal structure of eukaryotic translation initiation factor 2B

Kazuhiro Kashiwagi, Mari Takahashi, Madoka Nishimoto, Takuya B. Hiyama, Toshiaki Higo, Takashi Umehara,
Kensaku Sakamoto, Takuhiro Ito & Shigeyuki Yokoyama


Eukaryotic cells restrict protein synthesis under various stress
conditions, by inhibiting the eukaryotic translation initiation
factor 2B (eIF2B). eIF2B is the guanine nucleotide exchange
factor for eIF2, a heterotrimeric G protein consisting of α-, β- and
γ-subunits. eIF2B exchanges GDP for GTP on the γ-subunit of
eIF2 (eIF2γ), and is inhibited by stress-induced phosphorylation
of eIF2α. eIF2B is a heterodecameric complex of two copies each
of the α-, β-, γ-, δ- and ε-subunits; its α-, β- and δ-subunits
constitute the regulatory subcomplex, while the γ- and ε-subunits
form the catalytic subcomplex. The three-dimensional structure
of the entire eIF2B complex has not been determined. Here we
present the crystal structure of Schizosaccharomyces pombe eIF2B
with an unprecedented subunit arrangement, in which the α2β2δ2
hexameric regulatory subcomplex binds two γε dimeric catalytic
subcomplexes on its opposite sides. A structure-based in vitro
analysis by a surface-scanning site-directed photo-cross-linking
method identified the eIF2α-binding and eIF2γ-binding interfaces,
located far apart on the regulatory and catalytic subcomplexes,
respectively. The eIF2γ-binding interface is located close to the
conserved 'NF motif', which is important for nucleotide exchange.
A structural model was constructed for the complex of eIF2B
with phosphorylated eIF2α, which binds to eIF2B more strongly
than the unphosphorylated form. These results indicate that the
eIF2α phosphorylation generates the 'nonproductive' eIF2-eIF2B
complex, which prevents nucleotide exchange on eIF2γ, and thus
provide a structural framework for the eIF2B-mediated mechanism
of stress-induced translational control.

In eukaryotic translation initiation, eIF2 in the GTP-bound form
delivers an initiator methionyl-tRNA (Met-tRNAiMet) to the ribosome,
and then dissociates in the GDP-bound form. For the next
round of translation initiation, eIF2B catalyses the exchange of the
eIF2γ-bound GDP for GTP. This guanine nucleotide exchange activity
requires the HEAT domain at the carboxy (C) terminus and the
NF motif in the amino (N)-terminal region of the ε-subunit, as well
as the formation of the catalytic γε subcomplex. The NF motif consists
of consecutive Asn-Phe residues conserved in eIF2Bε (Asn237
and Phe238 in S. pombe eIF2Bε), and their mutations reduce the
nucleotide exchange activity drastically, down to the level of a
HEAT-domain fragment. Under stressful conditions, the phosphorylation of
eIF2α at Ser51 induces its stronger binding to the regulatory αβδ
subcomplex of eIF2B, which results in the inhibition of eIF2B. This
inhibition limits the supply of Met-tRNAiMet to the ribosome, leading
to global translational repression and de-repression of the translation
of stress-induced mRNAs. However, the mechanisms of the catalysis
and the inhibition and their mutual relationship have remained elusive,
and information about the overall structure of eIF2B has long been
awaited. Importantly, a variety of mutations of the human eIF2B subunits
are related to the neurodegenerative disease leukoencephalopathy
with vanishing white matter (VWM) or childhood ataxia with central
nervous system hypomyelination (CACH). In patients with this disease,
white matter lesions and neurological disorders severely deteriorate
during recovery after exposure to stresses, and the eIF2B activities
are generally lower than normal. Thus, the structure of eIF2B would
also provide information underlying the pathogenesis of VWM/CACH.

We prepared S. pombe eIF2B by co-expressing all subunits in
Escherichia coli (Kashiwagi et al., submitted). We examined the nucleotide
exchange activity against Komagataella pastoris (Pichia pastoris)
eIF2, which shares high sequence identity with S. pombe eIF2 (α: 66%;
β: 47%; γ: 76%). This recombinant eIF2B molecule exhibited the nucleotide
exchange activity on K. pastoris eIF2, and was inhibited by the
phosphorylated eIF2 protein (eIF2(αP)) (Extended Data Fig. 1a). The
eIF2B molecule showed stronger binding to both the trimeric eIF2(αP)
complex and the phosphorylated eIF2α subunit (P-eIF2α) than to their
unphosphorylated counterparts (Extended Data Fig. 1b, c). Therefore, the
recombinant eIF2B molecule displayed the characteristic biochemical
properties of the natural eIF2B.
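
For readers unfamiliar with how per-subunit identities such as those quoted above are obtained, the following is a minimal sketch of a percent-identity calculation on a pre-aligned pair of sequences. The convention used (identical residues divided by the number of columns with no gap in either sequence) is an assumption, since the text does not state which denominator was used, and the example sequences are made up.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns that contain no gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy alignment (not real eIF2 sequence): 4 of 5 ungapped columns match.
print(percent_identity("ACDE-F", "ACDQGF"))
```

Other conventions (for example, dividing by the length of the shorter sequence) give somewhat different numbers, so identities from different sources are only comparable when the denominator is known.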

We determined the crystal structure of eIF2B at 3.0-Å resolution,
and assigned almost all regions except for the ε-subunit HEAT domains
(Fig. 1a, Extended Data Fig. 2a, b and Extended Data Table 1). The crystal
structure revealed an unprecedented arrangement of the subunits:
the hexameric regulatory subcomplex resides at the centre, with the two
heterodimeric catalytic subcomplexes bound on opposite sides. The
assembly of the subcomplexes is primarily mediated by the β-ε and
δ-γ interactions (Fig. 1a and Extended Data Fig. 2a, b). This decameric
structure is consistent with many of the previously reported
results of subunit interactions (Extended Data Fig. 2). Mapping of
the residues corresponding to the missense mutations causing VWM
disease (Fig. 1b and Supplementary Table 1) revealed that many mutations
are located within or around various subunit interfaces (yellow in
Fig. 1b and Extended Data Fig. 3a-c). These interface mutations may
cause substantial effects on the structural and biochemical properties
of eIF2B.

Two key motifs for the nucleotide exchange activity, the HEAT
domain and the NF motif of the ε-subunit, both reside in the 'distal'
region of the structure, and several missense VWM mutations are
mapped near the NF motif (Fig. 1c and Extended Data Fig. 3d).
Therefore, nucleotide exchange on eIF2 occurs on the distal face
of eIF2B. To examine how eIF2γ binds to this face, we performed
surface-scanning photo-cross-linking experiments (see Methods).
Thirty-two variants of eIF2B, labelled with p-benzoyl-L-phenylalanine
(pBpa), were ultraviolet-irradiated in the presence of K. pastoris eIF2
(Extended Data Fig. 4a, b). Photo-cross-linking was detected between
eIF2γ and the distal face of eIF2B, when pBpa was incorporated at one
of the ten sites distributed over a large area, extending from eIF2Bε to
eIF2Bγ (Fig. 1c). The phosphorylation of eIF2 retarded cross-linking
at the sites near the NF motif (Gln117(2Bε) and Leu257(2Bε)), demonstrating
that the interaction of the γ-subunit of eIF2(αP) with the NF
motif-surrounding region of eIF2Bε is much less efficient than that of
the unphosphorylated form (Extended Data Fig. 4b, c). This observation
may indicate why eIF2(αP) is a poor substrate for nucleotide exchange
by eIF2B. Meanwhile, cross-linking at the sites distant from the NF
motif (Glu204(2Bε) and Ser258(2Bγ)) was hardly affected by the eIF2
phosphorylation (Extended Data Fig. 4b, c). The different effects of the
eIF2 phosphorylation may reflect the dual functions of the catalytic
subcomplex on eIF2 (ref. 17). The catalytic subcomplex also catalyses
the displacement of eIF5, which detaches from the ribosome together
with eIF2 and inhibits the dissociation of eIF2γ-bound GDP (ref. 18).
This activity is insensitive to phosphorylation and impaired by
eIF2B mutations. Therefore, the observed phosphorylation-insensitive
cross-linking at the distant sites is likely to have trapped
the interaction for this step. Furthermore, several VWM mutations
are mapped on the same surface area of the eIF2Bγ subunit as the
phosphorylation-insensitive cross-linking (Fig. 1c and Extended Data
Fig. 3d), suggesting that defects in the eIF5 displacement are relevant
to VWM disease in these cases.

We next investigated the interaction between the eIF2B regulatory
subcomplex and P-eIF2α. pBpa was incorporated at 80 sites in the
regulatory α-, β- and δ-subunits, and 13 of them were cross-linked with
S. pombe P-eIF2α. The cross-linked sites are distributed in all three
regulatory subunits, and are clustered within the cavity-like regions
of the eIF2B regulatory subcomplex (Fig. 2a and Extended Data
Fig. 5a, b). The cavities are formed around the centre of one set of the
α-, β- and δ-subunits ('central cavity') on the top/bottom ends of
the hexameric regulatory subcomplex (Fig. 1a). Thus, eIF2B possesses
two P-eIF2α-binding sites, in agreement with the ITC results
(Extended Data Fig. 1c). This cavity bears some residues with mutations
that have been isolated as Gcn− (general control non-derepressible)
mutations, which prevent cells from inducing translational control
upon eIF2 phosphorylation, and their positions overlap with the
eIF2α-cross-linking sites (Extended Data Fig. 6a-c). In addition, the majority
of the Gcn− mutation positions are located at the interfaces between the
regulatory dimers (Extended Data Fig. 6d, e), demonstrating that the
correct assembly of the regulatory subunits is requisite for the strong
binding with P-eIF2α and the induction of translational control. In
contrast, fewer missense VWM mutations were identified in the central
cavity (Extended Data Fig. 3f). Therefore, it seems that VWM disease
occurs in most cases by mechanisms unrelated to the inhibition of the
GEF activity of eIF2B by the eIF2α phosphorylation. Intriguingly, a
similar cross-linking pattern was observed without eIF2α phosphorylation
(Extended Data Fig. 5c), although the binding was weaker. The
difference was that additional cross-links occurred at Arg84(2B) and
Gln91(2B) when eIF2α was not phosphorylated, and both of these
residues exist in the interior of the cavity (Extended Data Fig. 5d). This
suggests that the central cavity can accommodate both phosphorylated
and unphosphorylated eIF2α in similar manners, but the interaction
with eIF2α is somewhat delocalized in the absence of phosphorylation.

Figure 1 | Overall structure of S. pombe eIF2B and mapping of human
VWM mutations and eIF2γ cross-links. a, The crystal structure of
S. pombe eIF2B (wall-eyed stereo view). The α-, β-, γ-, δ- and ε-subunits
are coloured blue, cyan, orange, green and pink, respectively. The NF
motifs in the ε-subunits are shown by red sticks. The visible C-terminal
Cα atoms of the ε-subunits are shown by spheres, and the HEAT domains,
whose electron densities were not observed in our structure, extend
outwards. b, Mapping of the S. pombe eIF2B residues corresponding to
VWM-causing missense mutations of human eIF2B (Supplementary
Table 1) as spheres on the S. pombe eIF2B structure. Their environments
are colour-coded on the spheres (green, solvent exposed; yellow, subunit
interface; brown, structural core). c, Mapping of the S. pombe eIF2B
residues corresponding to the solvent-exposed VWM-causing missense
mutations of human eIF2B (green) and to the pBpa cross-links with
K. pastoris eIF2γ, through the interaction between eIF2B and eIF2 (teal),
on the surface model. The NF motif is coloured red.

To delineate how eIF2α is accommodated in the central cavity, we
introduced pBpa at 27 sites in the N-terminal domain of S. pombe eIF2α
(eIF2α-NTD) and examined the cross-linking with the eIF2B regulatory
subunits. We successfully detected the cross-linking of eIF2α with
eIF2Bα and eIF2Bδ (Extended Data Fig. 7a, b). Mapping of the
cross-linked sites onto the human eIF2α structure revealed that the tip of
eIF2α-NTD is located close to eIF2Bα and eIF2Bδ (Fig. 2b). On the
basis of these complementary experiments, the structures of eIF2α and
eIF2B were docked in accordance with the cross-linking results (Fig. 2c),
which indicated that eIF2α-NTD is buried in the central cavity along
with the notch between the β- and δ-subunits of eIF2B. The eIF2α
KGYID sequence (residues 79-83), which is important for the strong
binding to eIF2B, closely faces eIF2Bα (Fig. 2c, inset). The residues
Ser48, Leu84 and Val89 of eIF2α, whose mutations suppress the
inhibitory effect of the eIF2α phosphorylation, are located at the bottom
of the cavity. Therefore, our docking model seems to be reasonable. In
the model, the phosphorylated residue Ser51(2α) is also located at the
bottom of the cavity (Fig. 2c, inset). The cavity has no positively charged
patch suitable for the accommodation of a phosphate group (Extended
Data Fig. 7c, d). Therefore, the mechanism underlying the enhanced
affinity for P-eIF2α is still unclear.

Figure 2 | Photo-cross-linking between eIF2α and the central cavity of
the eIF2B regulatory subunit. a, Mapping of the cross-linked (orange)
eIF2B sites with S. pombe eIF2α on the crystal structure (surface model)
(see Extended Data Fig. 5a). Asn100(2B{) is disordered in this structure.
Inset: the same view as in Fig. 1a (surface model). b, Mapping of the eIF2α
sites cross-linked with eIF2Bα and eIF2Bδ on the human eIF2α structure
(ribbon model; see Extended Data Fig. 7a, b). The sites cross-linked with
eIF2Bα, eIF2Bδ and both eIF2Bα and eIF2Bδ are represented as blue, cyan
and purple spheres, respectively. The phosphorylated residue Ser51(2α)
is represented as the red sphere. The KGYID sequence is coloured brown.
c, Docking of the eIF2B and eIF2α structures, based on our
photo-cross-linking experiments. Inset: the same view as in a.

Considering our finding that the cross-linking at the sites around the
NF motif is impaired by the eIF2 phosphorylation, as described above,
eIF2 seems to minimally contact the NF motif when P-eIF2α-NTD
binds to the central cavity. On the basis of the docking model of
P-eIF2α-NTD captured by the central cavity, we further docked the
trimeric aIF2, the archaeal homologue of eIF2 (ref. 24), on the eIF2B
structure. This docking model indicates that it is difficult for the
aIF2/eIF2 captured by the central cavity to simultaneously interact with
the NF motif (Fig. 3a). Therefore, the central-cavity-captured state is
the 'nonproductive' state, which is distinct from the 'productive' state
for efficient nucleotide exchange (Fig. 3b, left). When eIF2α is not
phosphorylated, eIF2γ-bound GDP is exchanged with GTP by the
HEAT domain and the NF motif, because the nonproductive state
is not stable (Fig. 3b, left). The stress-induced eIF2α phosphorylation
stabilizes the nonproductive state by the stronger interaction in
the central cavity, thereby impeding the nucleotide exchange on the
bound eIF2 and the entry of another eIF2 (Fig. 3b, right). According
to this model, the Gcn− mutations, which alleviate translational control
upon eIF2 phosphorylation, should destabilize the strong interaction
in the central cavity. We selected two Gcn− mutations in the central
cavity, Glu57Lys(2Bα) and Asp248Lys(2Bδ) (Glu44Lys(2Bα) and
Glu377Lys(2Bδ) in Saccharomyces cerevisiae) (Extended Data Fig. 8a).
These eIF2B mutations abrogated its strong interactions with eIF2(αP)
and P-eIF2α (Extended Data Fig. 8b, c). The GDP exchange assay also
revealed the alleviation of the inhibition by eIF2(αP) (Extended Data
Fig. 8d). Therefore, these Gcn− mutations diminish the strong interaction
in the central cavity, thus restoring efficient nucleotide exchange
(Fig. 3b).

These analyses additionally revealed that the Gcn− mutations also
affect the interaction with unphosphorylated eIF2α, and enhance
the nucleotide exchange activity of eIF2B (Extended Data Fig. 8c, e).
The nucleotide exchange activity of eIF2B is also enhanced by a small
molecule called integrated stress response inhibitor (ISRIB), which
was identified as a compound that blocks the stress response. The
ISRIB-resistant mutations are located near the pseudo-twofold rotational
axis of eIF2B (Extended Data Fig. 9). Therefore, our structure is
consistent with the proposed models of action of ISRIB, in which this
symmetric molecule stabilizes the eIF2B decamer.

The current study of eIF2B provides the structural basis not only for
an overall understanding of its function in translational control, but
also for understanding the mechanisms of VWM. We have focused so far
on the structural integrity and the catalytic activities of nucleotide
exchange and eIF5 displacement, as the causes of this disease. However,
VWM disease can reportedly result from alterations in other, as yet
undescribed, eIF2B functions. Mapping of missense VWM mutations on the
structure actually revealed the existence of some exposed mutations
that may not be explained by defects in the known functions (for example,
γArg312Gln and γIle346Thr in Extended Data Fig. 3b and the mutations
in Extended Data Fig. 3e). The present decamer structure is
expected to contribute to the elucidation of the currently undescribed
mechanisms of this disease.

Figure 3 | Model of the eIF2-eIF2B interactions.
a, Docking of the eIF2 and eIF2B structures, by
fixing eIF2α-NTD in the central cavity of eIF2B.
eIF2α-NTD is placed in the same manner as in
Fig. 2c. The structures of aIF2α-CTD and aIF2
from PDB accession number 2QMU, the archaeal
homologues of the corresponding segments of
eIF2, are positioned so they extend towards the NF
motif (red) from the C terminus of eIF2α-NTD.
b, Schematic representations of the eIF2-eIF2B
interactions proposed in the present study.
Panel labels: eIF2B + eIF2(αP); tight interaction.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

Expression and purification of recombinant S. pombe eIF2B. The detailed
methods for expression and purification of the native eIF2B protein are described
elsewhere (Kashiwagi et al., submitted). For the production of the
selenomethionine (SeMet)-substituted protein, the cells were grown in M9 medium until
the absorbance, A600 nm, reached 0.4, supplemented with SeMet and amino
acids, grown for 30 min, induced with 0.5 mM IPTG, and further grown at 18°C
overnight. The SeMet-derivative protein was also purified in the same manner as
described, except the DTT concentration was increased to 10 mM during the
purification and the Ni-Sepharose flow-through step was omitted. For cross-linking
experiments, the epitope tags were fused to the regulatory subunits (α: myc-tag
at the N terminus; β: HA-tag at the N terminus; δ: Strep-tag at the C terminus)
using a PrimeSTAR Mutagenesis Basal Kit (Takara).

Purification of K. pastoris eIF2. We used K. pastoris eIF2 for nucleotide exchange
assays and SEC analyses, because a high quality sample of K. pastoris eIF2 can be
prepared more efficiently than S. pombe eIF2, by the following methods. First, the
DNA sequence encoding the THF (Thrombin-His6-3×Flag) tag was inserted into
the K. pastoris genomic locus corresponding to the C terminus of eIF2, by
homologous recombination. The resultant K. pastoris strain producing the THF-tagged
eIF2 was grown in medium containing 1% yeast extract, 2% tryptone and 5%
(v/v) glycerol at 30°C. Harvested cells were suspended in 75 mM Tris-OAc buffer
(pH 7.5), containing 300 mM KOAc, 10% (v/v) glycerol, 5 mM Mg(OAc)2, 1 mM
EDTA, 2 mM DTT and protease inhibitors, and were disrupted using an FPG12800
Pressure Cell Homogenizer (Stansted Fluid Power). After centrifugation, the
supernatant was purified by chromatography on Q-Sepharose, Ni-Sepharose, HiTrap
Heparin, MonoS and Superdex 200 (GE Healthcare) columns. The final buffer was
20 mM HEPES-KOH buffer (pH 7.5), containing 150 mM KOAc, 5% (v/v) glycerol,
5 mM Mg(OAc)2, 0.1 mM EDTA and 1 mM DTT.

Expression and purification of recombinant S. pombe eIF2α. S. pombe eIF2α
was produced as the N-terminally GST-tagged and C-terminally His6-Flag-tagged
protein, in E. coli Rosetta2 (DE3). Transformed cells were grown in lysogeny broth
(LB) medium at 37°C. After the addition of 0.3 mM IPTG when the culture reached
A600 nm = 0.4, the cells were grown at 18°C overnight. The cells were lysed by
sonication in 20 mM HEPES-KOH buffer (pH 7.5), containing 150 mM KCl, 10% (w/v)
glycerol, 1 mM DTT and protease inhibitors, and the lysate was fractionated on
Ni-Sepharose and HiTrap Heparin columns. After overnight cleavage of the GST
tag with the HRV 3C protease at 4°C, eIF2α was purified on the HiTrap Heparin
and Superdex 200 columns in 20 mM HEPES-KOH buffer (pH 7.5), containing
150 mM KCl, 5% (v/v) glycerol and 1 mM DTT.

In vitro phosphorylation of the α-subunit of eIF2. The active human PKR
protein was prepared as previously described, with some modifications. Briefly,
the full-length PKR protein was expressed in E. coli Rosetta2 (DE3) cells as the
N-terminally His6-tagged form, and dephosphorylated in vivo by co-expression
with λ protein phosphatase. Transformed cells were grown in LB medium
supplemented with 0.2% glucose at 37°C. After the addition of 0.3 mM IPTG when
the culture reached A600 nm = 0.6, the cells were grown at 20°C for 24 h. The cells
were lysed by sonication in 20 mM HEPES-NaOH buffer (pH 7.5), containing
50 mM NaCl, 10% (v/v) glycerol, 1 mM DTT and protease inhibitors, and the
lysate was centrifuged. The supernatant was fractionated on Heparin-Sepharose,
Ni-Sepharose, HiTrap SP and Superdex 200 (GE Healthcare) columns. For the
preparation of P-eIF2α and eIF2(αP), PKR and S. pombe eIF2α were dialysed
against 20 mM HEPES-KOH buffer (pH 7.5), containing 50 mM KCl, 5 mM MgCl2,
0.1 mM EDTA and 1 mM DTT, and K. pastoris eIF2 was dialysed against the same
buffer containing 100 mM KCl. PKR was concentrated to 2 mg ml−1, and activated
by an incubation with 0.5 mM ATP at 30°C for 1 h. eIF2α and eIF2 were
phosphorylated by an incubation with the activated PKR and 0.5 mM ATP at 25°C.
Phosphorylation was confirmed by Phos-tag SDS-PAGE (Wako).
Guanine-nucleotide exchange assay. The eIF2-[3H]GDP binary complex was
formed by incubating 90 pmol of K. pastoris eIF2 with 37.5 pmol of [3H]GDP in
assay buffer (20 mM HEPES-KOH buffer (pH 7.5) containing 100 mM KCl, 10%
(v/v) glycerol, 0.1 mM EDTA, 5 mM NaF, 1 mM DTT and 2 mg ml−1 BSA), at 21°C
for 10 min. After the addition of 3 mM MgCl2 and 0.1 mM ATP, the eIF2-[3H]GDP
complex was further incubated with or without 1 µl of the activated PKR for 5 min,
and kept on ice. After a 5 min incubation at 15°C, measurements were started by
the simultaneous additions of a 100-fold amount of GDP and 22.5 pmol of eIF2B.
At each time point, samples were transferred into 2.5 ml of ice-cold wash buffer
(20 mM HEPES-KOH buffer (pH 7.5) containing 100 mM KCl and 5 mM MgCl2),
and immediately vacuum-filtered through mixed cellulose ester filters (Advantec).
After two washes with 2.5 ml of the ice-cold wash buffer, the filters were dried and
the radioactivity was quantitated by liquid scintillation counting.
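
Filter-binding time courses of this kind are often summarized by an apparent first-order rate constant for the loss of eIF2-bound [3H]GDP. The sketch below shows one simple way to obtain it, by a log-linear least-squares fit; the single-exponential model, the function name and the synthetic counts are all assumptions for illustration, since the text does not describe how the exchange data were analysed.

```python
import numpy as np


def fit_exchange_rate(times_min, counts, background=0.0):
    """Fit counts(t) = A0 * exp(-k t) + background by a log-linear regression.

    Returns (k, A0): the apparent rate constant (per minute) and the
    background-subtracted signal at t = 0.
    """
    t = np.asarray(times_min, dtype=float)
    y = np.log(np.asarray(counts, dtype=float) - background)
    slope, intercept = np.polyfit(t, y, 1)
    return -slope, float(np.exp(intercept))


# Synthetic, noise-free example (not data from this study).
t = np.array([0.0, 1.0, 2.0, 4.0, 8.0])
counts = 1000.0 * np.exp(-0.3 * t)
k, a0 = fit_exchange_rate(t, counts)
print(f"k = {k:.3f} per min, A0 = {a0:.0f}")
```

With real data a nonlinear fit that also estimates the background would usually be preferable, because the log transform over-weights the noisy late time points.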


Size-exclusion chromatography analysis. A mixture of 500 pmol of S. pombe
eIF2B and 1 nmol of K. pastoris eIF2 was incubated in 20 mM HEPES-KOH buffer
(pH 7.5), containing 300 mM KOAc, 5% (v/v) glycerol, 3 mM Mg(OAc)2, 0.1 mM
EDTA and 1 mM DTT, at 4°C. After the sample volume was adjusted to 200 µl, it
was fractionated on a Superose 6 column (GE Healthcare).

Isothermal titration calorimetry. S. pombe eIF2B and eIF2α were applied to
Sephacryl S-300 and Superdex 200 columns, respectively, in 20 mM HEPES-KOH
buffer (pH 7.5), containing 300 mM KOAc, 5% (v/v) glycerol, 3 mM Mg(OAc)2
and 0.1 mM EDTA. The measurements were performed with a MicroCal
Auto-iTC200 calorimeter (Malvern). The titration was performed by injecting 2 µl of
eIF2α (200 µM) into the eIF2B solution (20 µM), 19 times. To calculate the binding
stoichiometry and the dissociation constants, two runs were concatenated.
Crystallization and structure determination. The native and SeMet-derivative
eIF2B samples were concentrated to 5–6 mg ml⁻¹, and their crystals were obtained
and cryoprotected in similar manners to those described in Kashiwagi et al.
(submitted). The final data sets were collected at BL41XU of SPring-8 (Hyogo, Japan).
Data collection was performed at 100 K, and the wavelengths were 1.0000 Å for the
native data set and 0.9792 Å for the SeMet-derivative data set. The data sets were
processed with XDS³⁴ and SCALA³⁵. Data collection statistics are summarized
in Extended Data Table 1. The initial phases were determined from the
SeMet-derivative data set by the single-wavelength anomalous dispersion (SAD) method,
by the use of the autoSHARP pipeline³⁶. The molecular model was built
automatically with Buccaneer³⁷. The model was modified manually in Coot³⁸, and refined
by PHENIX³⁹ against the native data set. In the Ramachandran plot, 93.4% of the
residues in the model are in the favoured region, and 6.0% and 0.6% of the residues
are in the allowed and disallowed regions, respectively. The electrostatic surface
potential was determined with APBS⁴⁰.

Surface-scanning site-directed photo-cross-linking assays. The incorporation
of pBpa was performed by the expanded genetic code method: pBpa was
incorporated site-specifically into proteins using an E. coli RFzero strain, in which the
UAG codon is reassigned to pBpa⁴¹,⁴². For the expression of pBpa-labelled S. pombe
eIF2B, the codon at a specified position in the expression constructs (Kashiwagi
et al., submitted) was changed to a TAG triplet, using a PrimeSTAR Mutagenesis
Basal Kit. For the pBpa-labelled S. pombe eIF2α, the GST tag of the expression
construct was replaced by MBP, and one codon was changed to TAG. The
proteins labelled with pBpa were produced with the BL21 (DE3)-based RFzero strain,
expressing pBpaRS and UAG-decoding tRNA⁴¹,⁴². The cells were grown in LB
medium supplemented with 0.2% glucose and 1 mM pBpa, at 37 °C. When the
culture reached A600 nm = 0.8, 0.5 mM IPTG was added, and the cells were grown
at 20 °C for 24 h. The pBpa-labelled eIF2B and eIF2α were purified with Amylose
Resin (New England Biolabs), and the MBP tags were cleaved with HRV 3C
protease. For the cross-linking experiments between eIF2B and eIF2α, the proteins
were dialysed against 20 mM HEPES-KOH buffer (pH 7.5), containing 150 mM
KCl, 5% (v/v) glycerol and 1 mM DTT. For the cross-linking experiments between
eIF2B and K. pastoris eIF2, the proteins were dialysed against 20 mM HEPES-KOH
buffer (pH 7.5), containing 150 mM KOAc, 5% (w/v) glycerol, 3 mM Mg(OAc)₂,
0.1 mM EDTA and 1 mM DTT. The pBpa-labelled protein was divided into
aliquots corresponding to the amount purified from 50 ml LB culture, and mixed
with 100 pmol of the cross-linking target protein, and then the total volume was
adjusted to 110 µl. The final concentrations of the pBpa-labelled proteins were
about 0.5–1.0 µM. A series of protein variants, each with pBpa incorporated at one
specified surface site, was ultraviolet-irradiated in the presence of the cross-linking
target protein. Ultraviolet irradiation at 365 nm wavelength was performed on ice
Extended Data Figure 1 | Biochemical properties of recombinant
S. pombe eIF2B. a, Guanine-nucleotide exchange catalysed by S. pombe
eIF2B, and its inhibition by eIF2 phosphorylation. K. pastoris eIF2 was
labelled with [³H]GDP and incubated with ATP and activated PKR
(eIF2(αP); red line), or with ATP only (eIF2; blue line). Reactions were
started by the addition of excess unlabelled GDP with eIF2B (solid line)
or buffer (dashed line). Individual data points of triplicate analyses are
shown by dots and boxes. The dissociation rates of GDP from eIF2 are
summarized in the table. b, The SEC profiles of respective proteins
(S. pombe eIF2B (green), K. pastoris eIF2 (cyan), eIF2(αP) (pink)) (top),
eIF2B with K. pastoris eIF2 (middle) and eIF2B with eIF2(αP) (bottom).
The chromatograms of the absorbance at 280 nm and the Coomassie blue
stained SDS-PAGE gels are shown. For gel source data, see Supplementary
Fig. 1. c, ITC measurements between S. pombe eIF2B (as the decamer)
and S. pombe eIF2α (top), eIF2B and P-eIF2α (bottom). We used the
eIF2α subunit alone, rather than the trimeric eIF2, in these experiments
because of the difficulty in the preparation of highly concentrated eIF2.
Representative thermograms are shown. Two runs were concatenated for
analysis. The binding stoichiometry (N) and the dissociation constant (Kd)
were calculated from triplicate analyses (mean ± s.d.).

Panel a table, GDP dissociation rate (pmol/min):
  eIF2 + buffer        21.3
  eIF2 + eIF2B         63.1
  eIF2(αP) + buffer    22.1
  eIF2(αP) + eIF2B     37.9

Panel c fits (thermogram axes: time (min), µcal/sec; isotherm axis: molar ratio):
  eIF2B + eIF2α        N = 1.82 ± 0.12    Kd = 2.14 ± 0.79 µM
  eIF2B + P-eIF2α      N = 1.80 ± 0.03    Kd = 533 ± 84 nM
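
The numbers reported for panels a and c can be compared with simple ratios. The sketch below is only an illustration: the "fold stimulation" and "affinity gain" quantities and all variable names are assumptions introduced here, not metrics defined by the authors, and the input values are transcribed from the figure's table and fit labels.

```python
# Values transcribed from Extended Data Figure 1 (panels a and c).
# Keys and variable names are hypothetical labels chosen for this sketch.
gdp_dissociation_rate = {
    ("eIF2", "buffer"): 21.3,
    ("eIF2", "eIF2B"): 63.1,
    ("eIF2(aP)", "buffer"): 22.1,
    ("eIF2(aP)", "eIF2B"): 37.9,
}
kd_nM = {"eIF2a": 2140.0, "P-eIF2a": 533.0}  # 2.14 uM expressed in nM


def fold_stimulation(substrate):
    """Ratio of the GDP dissociation rate with eIF2B to that with buffer alone."""
    with_eif2b = gdp_dissociation_rate[(substrate, "eIF2B")]
    with_buffer = gdp_dissociation_rate[(substrate, "buffer")]
    return with_eif2b / with_buffer


stim_unphos = fold_stimulation("eIF2")
stim_phos = fold_stimulation("eIF2(aP)")
# A smaller Kd means tighter binding, so this ratio is > 1 when
# the phosphorylated form binds eIF2B more tightly.
affinity_gain = kd_nM["eIF2a"] / kd_nM["P-eIF2a"]

print(f"eIF2B stimulation, eIF2:     {stim_unphos:.2f}-fold")
print(f"eIF2B stimulation, eIF2(aP): {stim_phos:.2f}-fold")
print(f"Kd ratio (eIF2a / P-eIF2a):  {affinity_gain:.2f}")
```

Because both ratios are unitless, the comparison does not depend on the units of the tabulated rates.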
Extended Data Figure 2 | Architecture of the eIF2B subcomplexes.
a, b, Two different views of the crystal structure of S. pombe eIF2B
(wall-eyed stereo view), coloured as in Fig. 1a and from the same view
as in Fig. 1b and Fig. 1c, respectively. The assembly of the subcomplexes
is primarily mediated by the β–ε and δ–γ interactions. In this regard, the
results of the two prior mutational analyses!” are consistent with the
present structure, except that one suggested that the β-helical region of the
‘-subunit is involved in the inter-subunit interactions. c–k, Ribbon models
of the structures of S. pombe eIF2B subcomplexes (c–e, g, h, j) and proteins
with structural similarity (f, i, k). c, d, The α2 homodimer (c) and the βδ
heterodimer (d) in the eIF2B decamer. The conformations of each subunit
are similar to those in the human α homodimer*® and the Chaetomium
thermophilum βδ heterotetramer’, even though the relative orientations
of the subunits in the dimers are slightly different from those represented
in these partial structures. e, f, The regulatory subcomplex in the eIF2B
decamer (e) and ribose-1,5-bisphosphate isomerase (R15Pi) homohexamer“ (PDB
3A11) (f). The architecture of the regulatory subcomplex is an assembly of
three similarly shaped dimer moieties: one homodimer of the α-subunit
and two heterodimers of the β- and δ-subunits. The arrangement
of the regulatory subunits resembles that in the C. thermophilum βδ
heterotetramer’, and shares some similarity to that in the homohexameric
structure of R15Pi**. g, The γ-subunit of eIF2B. h, The ε-subunit of eIF2B.
i, Potato tuber ADP-glucose pyrophosphorylase (AGP) (PDB 1YP3)*°.
The dimerization interfaces between the catalytic subunits are coloured
in deeper shades (g, h). j, k, The subunit heterodimerization mode in
the catalytic subcomplex of eIF2B (j) and the subunit homodimerization
mode in the potato tuber AGP (k). The dimerization manner of the γ- and
ε-subunits is novel: each of their structures resembles the subunit structure
of the AGP homotetramer*, but they dimerize through their N-terminal
domains, in a different manner than the AGP homotetramer”’.
Extended Data Figure 3 | Mapping of the residues corresponding to
missense VWM mutations on the subunit interfaces and the distal
face of eIF2B. The eIF2B residues corresponding to VWM-causing
missense mutations in human (Supplementary Table 1) are mapped on the
S. pombe eIF2B structure, with the same subunit colouring as in Fig. 1.
The S. pombe eIF2B residues corresponding to VWM-causing missense
mutations are shown in parentheses. a, VWM-related residues are mapped
as spheres on the overall structure (ribbon model). The environments of
the residues are colour-coded (green, solvent-exposed; yellow, subunit
interface; brown, structural core) on the spheres. b, VWM-related residues
are mapped on the surface model of the inter-subcomplex interfaces on
the catalytic subcomplex side (left), with the interfaces for the α-, β- and
δ-subunits coloured blue, cyan and green, respectively, and on the βδ
dimer side (right), with the interfaces for the γ- and ε-subunits coloured
orange and pink, respectively. VWM-related residues around the βδ
dimerization interface are shown as spheres in the inset. The mutations
in the regulatory subunits are clustered around the dimerization interface
between the β- and δ-subunits, as mentioned in ref. 3. Our structure
further revealed that the binding site for the ε-subunit is formed by the
correct interaction between the β- and δ-subunits, thus explaining the
abundance of mutations around this interface. c, VWM-related residues
in the α2 homodimer are located around the homodimerization interface,
and shown as spheres in the inset. These VWM-related mutations around
the subunit interfaces (b, c) may cause appreciable degrees of subunit
dissociation from the eIF2B decamer, leading to incomplete complexes,
destabilization of eIF2B resulting in aggregation/degradation, and/or
changes in the conformation and activity of the intact eIF2B decamer.
d, VWM-related residues on the distal face of the catalytic subcomplex
are mapped on the surface model. The NF motif is shown in red.
Several VWM-related residues are located near the NF motif, including
Arg111(2Bε), corresponding to the human Arg136His(2Bε) mutation, for
which the mouse model is available*®. e, The ε-subunit further contains
several exposed missense VWM mutations, especially in the β-helical
domain. f, VWM-related residues in the central cavity. Only one residue,
corresponding to the human Lys110Glu(2Bα) mutation, is exposed to the
solvent.
Extended Data Figure 4 | Photo-cross-linking between pBpa-labelled
eIF2B and eIF2. a, The S. pombe eIF2B variants bearing a single
site-specific pBpa substitution in the catalytic subunits were mixed with
K. pastoris eIF2, and irradiated with ultraviolet (365 nm) for 5 min on ice.
Since eIF2 harbours the Flag-tag at the C terminus, the products
cross-linked with eIF2 were detected by western blotting with an anti-Flag
antibody. Site-specific slow-migrating bands that appeared after ultraviolet
irradiation were judged as cross-linked bands. The relevant bands are
indicated with teal dots. b, The eIF2γ-cross-linked sites are shown in teal,
except for the selected ones explained below, and the cross-link-negative
sites are shown in grey, on the surface model of the catalytic subcomplex,
coloured in the same manner as in Fig. 1. The NF motif is shown in red.
c, Time-course analysis of cross-linking with eIF2γ. Four selected sites
on the distal face (Gln117(2Bε) (light green in b), Leu257(2Bε) (blue),
Glu204(2Bε) (purple) and Ser258(2Bγ) (violet)) were examined by time
courses of the cross-linking with eIF2 or eIF2(αP). The ratio of the band
intensity of the eIF2B γ- or ε-subunit cross-linked with eIF2(αP) to that
with unphosphorylated eIF2 at each time point is shown below the lane.
ND means that no band of the eIF2B subunit cross-linked with eIF2(αP)
was detected at the time point. For gel source data, see Supplementary Fig. 1.
Extended Data Figure 5 | Photo-cross-linking between pBpa-labelled
eIF2B and eIF2α. a, The S. pombe eIF2B variants bearing a single
site-specific pBpa substitution in the regulatory subunit were mixed with
S. pombe P-eIF2α, and irradiated with ultraviolet (365 nm) for 5 min.
Cross-linking was detected as described in Extended Data Fig. 4. The
relevant bands are indicated with orange dots. b, The eIF2α-cross-linked
sites are shown in orange, and the cross-link-negative sites are shown
in grey, on the surface model of the overall structure of eIF2B, with the
same subunit colouring as in Fig. 1. c, The eIF2B variants were similarly
mixed with unphosphorylated eIF2α and irradiated with ultraviolet. The
bands that were also observed in a are indicated with orange dots, and the
unphosphorylated eIF2α-specific bands are indicated with magenta dots.
d, Arg84(2B8) and Gln91(2BQ), which exhibited unphosphorylated
eIF2α-specific cross-links, are shown in magenta. The view is the same as
in Fig. 2a. For gel source data, see Supplementary Fig. 1.
Extended Data Figure 6 | Solvent-exposed Gcn⁻ mutations located
in the central cavity and buried Gcn⁻ mutations clustered around
the trimerization interface. a, The residues corresponding to Gcn⁻
mutations'®?° are mapped in blue on the surface model of the S. pombe
eIF2B structure, with the same subunit colouring as in Fig. 1. b, c, The
residues corresponding to exposed Gcn⁻ mutations are mainly located
in the central cavity (b), and their locations coincide with the P-eIF2α
cross-link sites shown in orange (c, Extended Data Fig. 5b). d, e, The
residues corresponding to Gcn⁻ mutations on the α2 homodimer (d) and
the βδ heterodimer (e). The residues of S. pombe eIF2B corresponding to
S. cerevisiae Gcn⁻ mutations are indicated in parentheses. The interfaces
for the trimerization of the regulatory dimers are coloured grey. Most of
the Gcn⁻-related residues are mapped only on one face of the subunits, as
predicted‘**”. The present eIF2B structure revealed that these ‘mutation-rich’
faces are used for the assembly of the regulatory subunits to form the
subcomplex, and the Gcn⁻-related residues are clustered on the interface
for the trimeric assembly.
Extended Data Figure 7 | Cross-linking of pBpa-labelled eIF2α with
the central cavity formed by the eIF2B regulatory subunits. a, b, The
S. pombe eIF2α variants bearing a single site-specific pBpa residue were
mixed with epitope-tagged S. pombe eIF2B, and irradiated with ultraviolet
(365 nm) for 5 min on ice. Cross-linking was detected as described in
Extended Data Fig. 4. Cross-links with eIF2Bα were detected with an
anti-myc antibody (a), and those with eIF2Bβ were detected with an
anti-HA antibody (b). The relevant bands are indicated with blue dots in
a, and cyan dots in b. These cross-linked sites are mapped on the human
eIF2α structure”! in Fig. 2b. For gel source data, see Supplementary
Fig. 1. c, The model of the eIF2B–eIF2α complex, built on the basis of the
cross-linking experiments, is shown in Fig. 2c. The phosphorylated residue
Ser51(2α) is highlighted with the magenta circle. d, The electrostatic surface
potential of eIF2B, from the same viewpoint as in c. Red and blue colours
represent negative and positive potentials, respectively, of ±10 kT/e.
The cavity has no positively charged patch; therefore, the mechanism
underlying the enhanced affinity for P-eIF2α is still unclear. One possible
mechanism is a cation-mediated recognition of the phosphoserine residue.
The phosphorylated Ser51(2α) may coordinate a cation together with the
negatively charged residues at the bottom of the central cavity, although
we did not observe any electron density for such cations in the cavity.
Another possibility is a phosphorylation-induced conformational change
of the Ser51-flanking loop. The phosphorylation of Ser51(2α) may induce
the rearrangement of adjacent arginine residues, and enable a stronger
interaction with the negatively charged residues at the bottom of the
central cavity.
Extended Data Figure 8 | Analyses of eIF2B Gcn⁻ mutations.
a, The locations of the Gcn⁻-related residues mutated for the SEC and
ITC analyses (Glu57(2Bα) and Asp248(2Bδ)) are indicated on the
S. pombe eIF2B structure, with the same view and colouring as in
Extended Data Fig. 6b. b, The SEC analyses of the interaction between
K. pastoris eIF2(αP) and S. pombe eIF2B, bearing Gcn⁻-related mutations.
The chromatograms of the absorbance at 280 nm and the SDS-PAGE gels
of each run are shown. The green bar represents the elution point of free
eIF2B (Extended Data Fig. 1b). For gel source data, see Supplementary
Fig. 1. c, ITC measurements between S. pombe eIF2α or P-eIF2α and
eIF2B, bearing Gcn⁻-related mutations. Representative thermograms for
the ITC experiments are shown. d, e, The nucleotide exchange activities of
S. pombe eIF2B, bearing Gcn⁻-related mutations, on K. pastoris eIF2(αP)
(d) and eIF2 (e) were examined as described in Extended Data Fig. 1a
(αGlu57Lys mutant, blue line; δAsp248Lys mutant, green line; wild type,
grey line). Individual data points of triplicate analyses are shown by dots.
Extended Data Figure 9 | The locations of ISRIB-resistant mutations
on the eIF2B structure. Residues corresponding to the ISRIB-resistant
mutations” are mapped onto the eIF2B structure in red. The residue
corresponding to the Arg171Gln(2Bδ) mutation (Lys112(2Bδ)) is in the
disordered region, at the N terminus of the δ-subunit. The disordered
N-terminal segment of the δ-subunit is indicated by the dotted green line.
Extended Data Table 1 | Data collection and refinement statistics

                          Native                    SeMet
Data collection
Space group                                         P2₁2₁2₁
Cell dimensions
  a, b, c (Å)             144.5, 209.2, 223.5       144.6, 209.8, 223.8
Resolution (Å)*           49.29–2.99 (3.17–2.99)    44.25–3.38 (3.57–3.38)
Rsym*                     0.140 (1.558)             0.137 (0.803)
I/σI*                     13.3 (1.3)                20.7 (3.7)
Completeness (%)*         99.6 (97.9)               99.8 (98.9)
Redundancy*               6.3 (6.3)                 14.9 (14.7)
CC1/2*                    0.998 (0.524)

Refinement
Resolution (Å)            44.29–2.99
No. reflections           136110
Rwork / Rfree             0.221 / 0.271
No. atoms
  Protein                 28539
  Ligand/ion
  Water
B-factors
  Protein
  Ligand/ion
R.m.s. deviations
  Bond lengths (Å)
  Bond angles (°)

*Highest resolution shell is shown in parentheses.

CORRECTIONS & AMENDMENTS 


CORRIGENDUM 
doi:10.1038/nature16180 


Corrigendum: Acute stress 
facilitates long-lasting changes in 
cholinergic gene expression 


Daniela Kaufer, Alon Friedman, Shlomo Seidman & 
Hermona Soreq 


Nature 393, 373-377 (1998); doi:10.1038/30741 


It has been brought to our attention that Figs 1a and 5a of this Letter 
contain some irregularities in the data presentation. Specifically, in the 
three gel images in the top row and in the rightmost gel image in the 
second row of Fig. 1a, a black area of the gel was cropped and added
to unify the size of the box in order to match the size of the gel images 
without rescaling. All bands originated from the same gel. The histo- 
grams in the right column show quantification of the data from multi- 
ple experiments. In Fig. 5a, the similarity between the left and right gel 
images in the bottom row may be due to accidental duplication. The 
fact that a long time has passed since publication means it is no longer 
possible to obtain and investigate the raw data. None of these irregular- 
ities affect the original meaning of the experiments, their results, their 
interpretation, or the conclusions of the paper. Author S.S. is deceased. 


Correspondence should be addressed to D.K. (danielak@berkeley.edu), 
A.F. (alonf@bgu.ac.il), and H.S. (hermona.soreq@mail.huji.ac.il).

CORRIGENDUM 
doi:10.1038/nature16181 


Corrigendum: Identification of
the pollen self-incompatibility
determinant in Papaver rhoeas


Michael J. Wheeler, Barend H. J. de Graaf, Natalie Hadjiosif,
Ruth M. Perry, Natalie S. Poulter, Kim Osman,
Sabina Vatovec, Andrea Harper, F. Christopher H. Franklin &
Vernonica E. Franklin-Tong


Nature 459, 992-995 (2009); doi:10.1038/nature08027 


Recently, it has come to our attention that in the left panel of Fig. 2b of 
this Letter, the lanes labelled S.S4 and S¢S17 were duplicated. We have 
reviewed the original data. It seems likely that a duplicated part of the 
blot was placed over lane S¢Sj7 to aid alignment of molecular mass 
markers and inadvertently left there. We have now removed the dupli- 
cated lane and show the whole western blot (Fig. 1). Our conclusions 
are unaffected. 
Figure 1 | This is the corrected left panel of Fig. 2b. The right panel
showing the Coomassie staining is not shown. 

CORRIGENDUM 
doi:10.1038/nature16158 


Corrigendum: A SUMOylation- 
defective MITF germline mutation 
predisposes to melanoma and 
renal carcinoma 


Corine Bertolotto, Fabienne Lesueur, Sandy Giuliano,
Thomas Strub, Mahaut de Lichy, Karine Bille, Philippe Dessen,
Benoit d’Hayer, Hamida Mohamdi, Audrey Remenieras,
Eve Maubec, Arnaud de la Fouchardiére, Vincent Molinié,
Pierre Vabres, Stéphane Dalle, Nicolas Poulalhon,
Tanguy Martin-Denavit, Luc Thomas, Pascale Andry-Benzaquen,
Nicolas Dupin, Francoise Boitier, Annick Rossi,
Jean-Luc Perrot, Bruno Labeille, Caroline Robert,
Bernard Escudier, Olivier Caron, Laurence Brugieres,
Simon Saule, Betty Gardie, Sophie Gad, Stéphane Richard,
Jérôme Couturier, Bin Tean Teh, Paola Ghiorzo,
Lorenza Pastorino, Susana Puig, Celia Badenas, Hakan Olsson,
Christian Ingvar, Etienne Rouleau, Rosette Lidereau,
Philippe Bahadoran, Philippe Vielh, Eve Corda,
Hélene Blanché, Diana Zelenika, Pilar Galan, The French
Familial Melanoma Study Group, Valérie Chaudru,
Gilbert M. Lenoir, Mark Lathrop, Irwin Davidson,
Marie-Francoise Avril, Florence Demenais,
Robert Ballotti & Brigitte Bressac-de Paillerets


Nature 480, 94-98 (2011); doi:10.1038/nature10539 


In this Letter, one image was mistakenly duplicated during preparation
of the artwork. In the original Fig. 3d, the left image illustrating
migration of RCC4 cells transduced with empty adenovirus (EV) at
24 h is a duplicate of the middle image showing migration of RCC4 cells
transduced with an adenovirus encoding Mi-WT. The corrected images
and migration graph are shown in Fig. 1 of this Corrigendum. This
correction does not alter any of the conclusions, and the authors
apologize for any confusion this may have caused. Nature has not received
a response from the following authors to approve this Corrigendum:
V.M., T.M.-D., A.Ro., P.B., E.C. and V.C., and C. Becuwe., J.-L.B., J.C.-B.,
S.D., C.D., J.L., and K.M. from The French Familial Melanoma Study
Group (L.D. is deceased).
Figure 1 | This is the corrected Fig. 3d of the original Letter.
ILLUSTRATION BY THE PROJECT TWINS.


TOOLBOX 


THE MANUSCRIPT-EDITING 
MARKETPLACE 


A peer-to-peer website aims to disrupt the author-services industry. 


BY JEFFREY M. PERKEL 


As Sebastian Eggert prepared to submit a conference article, he realized he
had a problem: neither he nor his research adviser was a native English speaker,
and neither had much experience in writing and publishing research papers.
But Eggert, a master’s student in mechanical engineering at the Technical
University of Munich in Germany, had heard of a website where he could
purchase editing services from an expert: an online marketplace called Peerwith.

Launched in October 2015 and still in beta testing, Peerwith is a forum through
which researchers can find and negotiate with service providers such as editors,
translators, statisticians and illustrators to improve their research papers.
The site boasts “hundreds of experts”, most of them with expertise in the social
sciences and humanities. Users post a job request detailing the subject area of
the document, its length and the desired turnaround time. Experts then bid for
the job, and both experts and users rate each other afterwards. Peerwith’s
business model is akin to freelance marketplaces such as Upwork, says co-founder
Joris van Rossum, who left the journal publisher Elsevier to start his firm,
except with a strictly academic focus.

A market for author services on research papers already exists; van Rossum
estimates it at hundreds of millions of dollars annually. It includes both large
editing companies such as American Journal Editors (AJE), Edanz, Editage and
Macmillan Science Communication (MSC, which is owned by Nature’s parent
company), and freelancers. But a peer-to-peer online marketplace, van Rossum
says, makes services more affordable by cutting out the middleman and
efficiently matching buyers and sellers. (Peerwith receives a cut of 10–20%
for each transaction; the other firms would not comment on their margins.)
At the site, authors can review the experts who bid for work to identify the
best fit, and can check to see how others have rated them.

Val Kidd, an editor and translator based in 
the United Kingdom, earned €200 (US$223) 
on Peerwith to translate a presentation for 
Emanuel Rutten, a philosopher at the Free 
University, Amsterdam, in the Netherlands. 
The process, from job posting to completed 
document, took less than two weeks, Rutten 
says. “It’s really smooth.” For her part, Kidd 
says that the interaction with her client 
improved the final product. At most author- 
services companies Kidd works with, she 
says, editors and translators cannot contact 
the author should they have questions — the 
client interacts with the service, which iden- 
tifies a freelancer to handle the job. 

Peerwith doesn't vet its service providers, 
says Anna Sharman, founder of Cofactor, a 
London-based author-services consultancy. 
So, unlike her own and other such compa- 
nies, there is no guarantee that the ‘experts’ 
really are qualified. Editors at Cofactor 
undergo a rigorous recruitment process, 
Sharman says, and she double-checks their 
work before it is returned to the client. 

Sharman says that she could see Peerwith 
as a marketing channel for her business, but 
is concerned that it may foster a “race to 
the bottom” in pricing. She says that when 
she created an account, the only request 
she saw was from someone who wanted a 
5,000-word article edited for US$9, “a ridicu- 
lously small amount”. Sharman charges £60 
($87) per 1,000 words at Cofactor. At Edit- 
age, a 6,000-word article with 1-week turna- 
round costs $350 at the company’s ‘premium’ 
price, and AJE charges $594. And for ‘exten- 
sive’ scientific editing at MSC (by a panel of 
at least four editors with experience at high- 
impact journals), a typical 5,000-word article 
with a 17-day turnaround costs $2,860. 
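
To put the quoted prices on a common footing, they can be normalized to a price per 1,000 words. The sketch below uses only the US-dollar figures given above; the function and variable names are illustrative, and it assumes that the AJE figure refers to the same 6,000-word article as the Editage quote, which the text implies but does not state outright. Turnaround times and service levels differ between the quotes, so the comparison is rough.

```python
# (service, quoted price in US$, word count) as reported in the text above.
# Assumption: the AJE price is for the same 6,000-word article as Editage.
quotes = [
    ("Peerwith request", 9.0, 5000),
    ("Cofactor", 87.0, 1000),
    ("Editage (premium)", 350.0, 6000),
    ("AJE", 594.0, 6000),
    ("MSC (extensive)", 2860.0, 5000),
]


def price_per_1000_words(price_usd, words):
    """Normalize a quoted price to US$ per 1,000 words."""
    return price_usd * 1000 / words


per_1000 = {name: price_per_1000_words(price, words) for name, price, words in quotes}

# Print the quotes from cheapest to most expensive per 1,000 words.
for name, value in sorted(per_1000.items(), key=lambda item: item[1]):
    print(f"{name:20s} ${value:8.2f} per 1,000 words")
```

On this basis the request Sharman saw works out at under US$2 per 1,000 words, far below any of the company rates quoted.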

Peerwith is still getting up to speed, van
Rossum says. But ultimately, a community-
based marketplace could succeed “if there's
the right balance of price and quality”, says
Deni Auclair, a lead analyst for the media, 
information and technology consulting firm 
Outsell, headquartered in Burlingame, Cali- 
fornia. The larger editorial service providers 
might be left to target institutions more than 
individuals, she suggests. 

As for Eggert, he received one bid to his 
job posting, and paid €100 for style and con- 
tent edits to his 2,500-word paper, which 
he negotiated down from €120. He says he 
would use the service again, and recommend 
it to others, assuming the price is right.
EXPERIMENTATION


Better designs for 
animal studies 


A web application aims to improve life-sciences research. 


BY DANIEL CRESSEY 


A free online tool that visualizes the design of animal experiments and
gives critical feedback could save
scientists from embarking on poorly designed
research, the software's developers hope.

Over the past few years, researchers have 
picked out numerous flaws in the design 
and reporting of published animal experi- 
ments, which, they warn, could lead to bias. 
In response, hundreds of journals have agreed 
to voluntary guidelines for reporting animal 
studies: checklists of best practice, such as what 
statistical calculations to use to ward off error. 

But these lists kick in after scientists sub- 
mit a paper, says Nathalie Percie du Sert, who 
specializes in experimental design at the 
National Centre for the Replacement, Refine- 
ment and Reduction of Animals in Research 
(NC3Rs) in London. “When you get to the 
reporting stage, that's a bit too late,” she says. 
“We want researchers to think about these issues 
at the design stage.” 

Percie du Sert’s solution is a programme 
called the Experimental Design Assistant 
(EDA), which launched in October 2015. She 
hopes that it will help to improve the quality of 
animal research and perhaps even become an 
integral part of the conduct of animal studies. 

The EDA (go.nature.com/koasai) allows scientists to create a visual
representation of an experiment by laying out its key elements (hypothesis,
experimental method and planned analysis) in logically connected, coloured
boxes. The software then uses a built-in set of rules to spot potential
problems, and suggests refinements. These may be simple (the researcher hasn't
specified how to randomize animals to the control or treatment arm) or more
complex (there are potential confounding variables in the control and trial
arms). The tool can also assist scientists with calculating the sample size
needed to ensure a statistically robust result, or with randomization.
There’s nothing fundamentally new in the 
EDA, says Percie du Sert. It builds on existing 
knowledge of good experimental design. But it 
can aid scientists who have little training in the 
area, she says, and teach them design choices. 
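
As an illustration of the two calculations mentioned above, sample size and randomization, here is a minimal sketch. It is not the EDA's own method, which the article does not describe: it assumes a two-sided comparison of two group means, uses the standard normal-approximation formula, and all function names, defaults and the fixed seed are hypothetical choices made for this example.

```python
import math
import random
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.8):
    """Animals per arm for a two-sided, two-group comparison of means.

    Normal approximation: n = 2 * ((z_(1-alpha/2) + z_power) / d)^2,
    where d is the standardized effect size (difference / standard deviation).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def randomize(animal_ids, arms=("control", "treatment"), seed=None):
    """Shuffle the animals, then deal them out to the arms in turn."""
    rng = random.Random(seed)
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)
    return {arm: sorted(shuffled[i::len(arms)]) for i, arm in enumerate(arms)}


n = sample_size_per_group(effect_size=1.0)
allocation = randomize(range(1, 2 * n + 1), seed=1)
print(f"{n} animals per arm")
for arm, ids in allocation.items():
    print(arm, ids)
```

A fixed seed makes the allocation reproducible for the record; in practice the assumed effect size drives the result far more than any other input, so halving it roughly quadruples the number of animals needed.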


Since the EDA’s launch, around 400 accounts
have been created to use it, producing between 
50 and 100 experimental diagrams in total each 
month, says Percie du Sert. She does not have 
access to detailed information about its users; 
the sensitivities around animal research and the 
need to protect researchers mean that data on 
who is using it, and how, are secured. 

The Wellcome Trust’s Sanger Institute, a 
genome-research centre in Cambridge, UK, is 
rolling out an internal training programme that 
includes lessons on design and use of the EDA; 
the agency is encouraging staff to use the soft- 
ware to present experiments to ethical-review 
committees, says Natasha Karp, a biostatistician 
at the institute. Karp took part in the working 
group that oversaw the tool’s development, and 
says that she uses it to visualize the experiments 
of the biologists whom she supports. 

The EDA is not the only software that aims 
to improve research quality and reproducibility. 
Other tools check manuscripts before publica- 
tion for issues such as errors in formatting or 
omission of P values. These include Penelope, 
a paid-for service aimed at journal publishers; 
another tool called WebCONSORT (which is 
not yet freely available) is being tested as a way 
to improve reporting of clinical trials. Protocol 
Navigator, a free web application created by sci- 
entists at Cardiff University, UK, also produces 
visual experiment maps that can be shared. 

But the EDA specifically targets animal 
research, and as such, is unique in its ability to 
give a rapid overview of the design and analysis 
of animal experiments, says Karp. “There isn’t 
anything else quite like this system.” 

Percie du Sert hopes that a visual represen- 
tation of experiments could become com- 
mon practice, used in research papers or lab 
meeting presentations. Eventually, the EDA 
might even produce time-stamped versions to 
prove that an experiment was conducted and 
analysed as designed, she adds, rather than 
being the product of a scientist searching for 
meaning in data after the fact — a frowned- 
upon practice sometimes called HARKing 
(‘hypothesizing after the results are known’).

The online tool can seem a little compli- 
cated, says Jeffrey Mogil, who studies pain at 
McGill University in Montreal, Canada. “But I 
actually think that people might get a big kick 
out of using this,” he says. “It looks like a cool 
way to break in new grad students or teach the 
scientific method to undergrads.”

Lone-parent scientist


Limited institutional resources mean that single parents often need a network of support to
further their scientific careers.


BY HELEN SHEN 


On the Saturdays when immunologist
Patricia Castillo has an experiment
on the boil, she hops into her
car and heads to the laboratory at 6 a.m. A 
postdoctoral researcher and a single mother, 
she aims to return home before her two sons — 
one aged 13, the other aged 9 — notice that 
she has gone. 

Castillo and the father of her children 
separated in 2010, while she was in her third 
year of graduate school at the University of 
California, Davis. Both of her boys needed 
before- and after-school day care back then, 
and the expense ate up two-thirds of her stu- 
dent stipend. She received multiple childcare 
grants that were available to student parents at 
the university, and took out student loans that 
reached a total of US$80,000. But after she fin- 
ished graduate school, it got worse for Castillo: 
as a postdoc, she no longer qualified for the 
grants or student loans. She had to withdraw 
her younger son from before-school day care
and now must juggle her schedule to make 
things work. In practice, this means that she 
arrives later at the lab in the mornings and 
spends more of her weekends catching up on 
tasks. “Financially, it’s still kind of stressful. At 
the end of the month, I’m just barely making it 
till the next pay period,” says Castillo.

Castillo’s experiences as a single parent are 
not unusual. Scientists of all stripes, whether 
they are graduate students and postdocs like 
Castillo or senior researchers in academia and 
industry, face a shortage of dedicated resources 
to help them to balance the demands of both 
their career and family. Although some receive 
help from the ex-partner with whom they 
share custody, many will go it alone. 

When childcare expenses or demands pile 
up, experiments run over or children fall ill, 
scientists might also turn to family, friends 
or hired helpers to get it all done. Many, like 
Castillo, spend more on childcare and other 
help than they can comfortably afford. And 
because research fellowships, programmes 
and policies that specifically support single
parents are rare, most scientists must look for
their own solutions. 

As they improvise, some have found 
unexpected support — and have even 
negotiated flexible working — by opening 
up to their supervisors about the challenges 
that they face. A number of universities offer 
limited childcare subsidies to postdocs and 
employees, as well as their students, through 
institutional benefit programmes. But there is 
no one-size-fits-all solution, and single-parent 
scientists must find creative ways to meet the 
needs of their family. 


CREATIVE PARENTING 

It took a network of helpers, a lot of money and a
sympathetic senior colleague to help astrophysi- 
cist Sara Seager to keep her head above water 
after she became a single parent in 2011. At the 
time, she also had a new job to learn — fast. A 
professor at the Massachusetts Institute of Technology
(MIT) in Cambridge, Seager lost her first husband
to cancer when their sons were 6 and
8 years old. Until then, her husband had been
the one to handle grocery shopping, cooking
and other household tasks while she focused on 
her single-minded search for Earth-like planets. 

Emotionally and physically exhausted, Seager 
invited a friend to live with her in exchange for 
getting her boys ready in the morning and tak- 
ing them to school. She also hired a babysitter to 
pick up and stay with the children after school. 
And she employed a housekeeper to prepare 
food and to clean the house. “I spent more than
I earned, and I had to plough through my savings,”
she says.

When the dean of the school of science at 
MIT enquired about how she was doing one 
day in 2012, Seager admitted that she was strug- 
gling. To her surprise, he asked what she would 
need financially to make things work. MIT then 
offered her a salary supplement that will last 
until her youngest child turns 16. (Seager subse- 
quently won a no-strings, $625,000 MacArthur 
fellowship and has remarried — further helping 
her to cope with the demands of family life.) 


“As a single mom, you really have to 
streamline your life more than most other 
people do. Delegating and spending money 
where you can just helps so much,” says
Seager. “You can’t do it all alone.” And as she
discovered, existing networks of friends and 
neighbours can also be a valuable source of 
emotional and practical support. 


LOCATION CHANGE 
For biologist Florence Roullet, sharing the 
custody of her children and openly communi- 
cating with her employer have been crucial for 
achieving a positive work-life balance. She and 
her partner, also a scientist, split in 2008, but 
the pair have continued to live near each other 
— even moving from Canada to the United 
States and back — to share the parenting of 
their twin sons. 

At the time of the break-up, Roullet was 
preparing to leave Canada for a postdoctoral 
fellowship at the US National Institutes of 


GROUNDED WITH CHILDREN 


Travel assistance for scientists with families 


Attendance at conferences and other
gatherings of scientists is often an important
part of the research process, but it can
also be one of the most difficult aspects
for scientists who are single parents. “Most
people have no concept of how hard travel
is for a single parent,” says Sara Seager,
an astrophysicist at the Massachusetts
Institute of Technology in Cambridge. For
two years after she was widowed, she cut
back on going to conferences and invited
seminars. The cost of hiring round-the-clock
carers to stay with her children, as
well as the emotional stress of worrying
about her family from afar, outweighed the
professional benefits. But Seager notes that
she was far enough into her career to not
worry about jeopardizing her professional
advancement. “I could just say no to things.”

Scientists who are at an earlier stage of 
their career face a tougher choice. They 
often feel that their advancement is tied to 
an ability to follow research opportunities 
wherever they arise, or to travelling to 
conferences where they can share their 
findings with more-senior scientists. 

To help parents to attend research
conferences, some universities have
established grant programmes that partially
offset the associated costs of childcare.
For example, postdoctoral researchers
and assistant professors from the David
Geffen School of Medicine at the University
of California, Los Angeles, can apply for
a travel award of up to US$500 to cover
childcare, travel and registration for
professional meetings. And Yale University




130 | NATURE | VOL 531 | 3 MARCH 2016 


in Connecticut offers reimbursements of up 
to $1,000 per year for childcare expenses 
that relate to travel for postdocs and certain 
faculty members. Similar programmes exist 
at institutions such as Brown University
in Rhode Island, Harvard University in 
Massachusetts, Northwestern University in 
Illinois and Stanford University in California. 

A number of scientific societies also offer 
support for parents who wish to attend 
their annual conferences. This year, for 
example, the Entomological Society of 
America in Annapolis, Maryland, will offer 
on-site childcare for children who are aged 
2 or older — at no cost. (Usually, the society 
offers grants of up to $400 toward childcare 
expenses during its conferences.) 

And the US Society for Neuroscience 
and the American Geophysical Union, 
both in Washington DC, provide childcare 
programmes and services for a fee at their 
annual meetings. Other organizations 
offer assistance with childcare costs. The 
American Society for Biochemistry and 
Molecular Biology in Rockville, Maryland, 
provides grants of up to $1,000, and the 
UK Microbiology Society offers grants of up 
to £1,000 ($1,400), to help parents with 
costs that are associated with attending one 
society meeting per year. Many such awards 
do not cover children’s airfare or other 
travel costs. 

To learn more about options for travel 
support and to verify eligibility for specific 
childcare grants and services, single-parent 
scientists should contact their present 
institution or scientific society. H.S.

Health (NIH) in Bethesda, Maryland. With
two four-year-old children to care for, “it was
a tough time,” she recalls. But in an unconventional
move, Roullet’s former partner agreed to
relocate, and he also secured a job at the NIH. 
They began to alternate the weeks in which 
each looked after the children — a schedule 
that continues to this day. 

Roullet was upfront about her family 
responsibilities before starting her job at 
a biotechnology company in Burlington, 
Canada, where she coordinates clinical trials 
and oversees regulatory applications. Although 
her work duties mostly fit into a nine-to-five 
schedule, Roullet does need some flexibility 
when she is looking after the children. “I had
a very clear discussion with the person hiring
me that I had children: ‘This is my reality. They
will be sick, and I will have to go home early
sometimes,’” she says.

She packs meetings, work functions and 
personal plans into the weeks that her ex- 
partner has their children. When they are liv- 
ing with her, Roullet sometimes works from 
home or at night — with the approval of her 
manager, who supports the flexible working 
arrangement as long as she continues to meet 
her goals and deadlines. “There are times when 
I need to work all night to finish up something, 
and I will do that,” she says. 

Research fellowships and grants that are 
aimed at single parents are scarce, although 
some programmes do exist to help parents to 
attend conferences for short periods of time 
(see ‘Travel assistance for scientists with fami- 
lies’). And longer-term assistance is available 
through a small number of programmes that 
are designed to help scientists with significant 
carer responsibilities. 

Lily Asquith, a particle physicist in Brighton, 
UK, secured one such award at a crucial 
moment. When her daughter Jessie suddenly 
became ill, Asquith was forced to consider 
whether to leave the research career she had 
worked long and hard to establish. 

A single parent from the age of 19, Asquith 
was living in a women’s shelter when she 
decided to take evening classes in mathemat- 
ics and physics. She was then accepted into an 
undergraduate degree programme at Univer- 
sity College London. Her low income meant 
that she could put her daughter in day care 
for a reduced fee, and after Asquith received 
her PhD in 2010, the pair moved to the United 
States so that Asquith could pursue postdoc- 
toral research at Argonne National Laboratory 
in Lemont, Illinois. 

In 2012, when Asquith’s work took her to 
CERN, the European laboratory of particle 
physics near Geneva, Switzerland, her now 
teenage daughter decided to stay with her 
aunt in the south of England. So when Jessie 
was diagnosed with ulcerative colitis in 2013, 
Asquith was frantic to return to the United 
Kingdom, even without the prospect of 
continuing her research.

But in 2014, just as she was about to
accept a job at a data-science company, 
she learned that her application for a Royal 
Society Dorothy Hodgkin Fellowship had 
been accepted. The five-year research award 
helps scientists with considerable carer 
responsibilities or health issues to pursue 
flexible working arrangements. The fellow- 
ship enabled Asquith to move her research 
to the University of Sussex in Brighton. 
“It was a real life-changer,” she says. Now,
she can stay on top of her research as well 
as spend time with Jessie, who has been in 
remission for the past few months. 

Although Asquith has been able to con- 
tinue her work without interruption, other 
scientists who are single parents might need 
to take a break of up to several years to tend 
to their families. For those researchers, 
the Daphne Jackson Trust in Surrey, UK, 
offers a fellowship that helps scientists to 
update their skills and return to research 
after a break. The NIH and the US National 
Science Foundation also offer options that 
enable scientists who take a leave of absence 
to extend the funding period of grants. 


EXTENDED FAMILY 

The demands of work and childcare can be 
all-consuming for a single-parent scien- 
tist. But taking care of their own emotional 
needs should be a priority, too. “Social support
is really important for single parents,”
says Seager. “You need other single parents. 
You need to find your demographic.” 

For Seager, that clan was an informal 
support network for widows in the town 
where she lives. The women met regularly 
for coffee and commiserated while trading 
parenting advice and offering each other 
emotional support. Seager also found sup- 
port from within the lab. Her research group 
rallied round after her husband’s death and 
became a sort of extended family. Often they 
would go on holiday with Seager and her 
children as an extension to conference trips. 
Back home, the group would venture out 
on weekend hikes. “The lab played a huge, 
amazing role,” Seager says. “Ultimately, it’s
really about finding a social network. If you
don’t have family to rely on, it’s the friends
who can step in and take care of your kids
and provide another kind of support.”

Scientists who are single parents say that 
although the sacrifices and struggles can be 
arduous, the rewards are worthwhile. And 
the fulfilment that stems from maintaining 
a research career in difficult circumstances 
can help scientists to become more effective
parents. “I wouldn’t have done all this,”
says Asquith, “if it hadn’t been for the ambition
to be the kind of parent I wanted for
my daughter.”


Helen Shen is a freelance writer in 
Sunnyvale, California. 


TURNING POINT 



Gun-crime analyst 


Garen Wintemute has spent his career — and 
more than US$1 million of his own funds
— studying firearm violence. The physician- 
scientist at the University of California, Davis, 
reports that a new generation of gun-violence 
researchers is emerging as funding picks up. 


Were funds available when you began this work? 
Yes. In the late 1980s, rates of firearm violence 
were rising rapidly, and Congress made research 
funds available to attract people to the field. But 
in 1996, that mobilization effort was choked off. 
It was never an outright ban on research: then- 
US Representative Jay Dickey (Republican, 
Arkansas) introduced an amendment stating 
that funds from the Centers for Disease Control 
and Prevention could not be used to advocate 
for or promote gun control. But the writing on 
the wall was ‘don’t fund the research’. That was
applied to budget appropriations, including for 
the US National Institutes of Health (NIH). My 
group and others lost funding. 


How were you able to continue your research? 
We had to let people go, but we secured funding 
from the National Institute of Justice and private 
foundations. It wasn't enough. 


Did that prompt you to use your own funds? 

I started to spend my own money in 2005 
because I wanted to bring people together 
and keep this work going. Some of it can only 
be done in California, because we collect 
high-quality individual-level health and crim- 
inal-justice data related to firearm violence. In 
2014, I wrote a pledge to give more over the next 
4 years, up to $2 million. 


What were your key findings? 

One project established that, for people who 
buy guns legally, previous convictions for 
violent misdemeanours confer great risk for 
future violence. We also did the first prospective 
study tracking handgun purchasers and their 
incidence of violence: in the first week of gun 
ownership, the risk of firearm suicide is 57 times 
higher than expected for adults in California. 


Has the funding situation changed? 

Yes. Last year, Dickey said he has regrets — he 
meant for the amendment to cut off advocacy, 
not research. On 11 February, he expressed sup- 
port for California legislation to establish a Fire- 
arm Violence Research Center at the University 
of California. The person who had the most to 
do with funding being cut off is in a uniquely 
influential position to advocate for its increase. 
As it stands now, the NIH is funding research.


What is the current status of the field?


I used to worry about who would do this 
research after I retired. There were maybe 12 of 
us around the country, all of a similar age. Without
funding, there was too much uncertainty for
most people to enter the field. But I don’t worry 
about that anymore. We're now hiring three 
nationally ranked junior faculty members to 
join us as investigators, and launching a fellow- 
ship programme. 


What are your conversations with early-career 
researchers like? 

The field is controversial and can be physically 
risky. We get hate mail and death threats. But 
there’s plenty of intellectual elbow room and 
hugely important questions nobody is answer- 
ing. Since the 2012 Sandy Hook shooting in 
Newtown, Connecticut, I have heard from all 
kinds of people — undergraduates to early- 
career professors — asking how they can help. 


How do they feel about the risks? 

People have become more tolerant of the risks 
involved in this work. When there is something 
preventing research from being done, that thing 
feels like a bully. And no one likes a bully. 


Do you think the field will continue to grow? 
Although mass shootings haven't resulted in 
congressional action, research funders have 
stepped up. I think we're still at the beginning of 
the beginning of a long-term change in the way 
the country thinks about firearm violence. We're 
setting up the infrastructure and labour force to 
keep this work going. All that said, compared to 
the need, the situation is still very grim.


INTERVIEW BY VIRGINIA GEWIN 
This interview has been edited for length and clarity. 
See go.nature.com/er8c4g for more about his work.


AJDENIA


Let there be sunlight.


It’s cold and sunless in the tunnels. Bart is sweeping
the walls vigorously, trying to warm himself up inside his
stiff uniform. He scans every crack he comes across for
contaminants, seals it, then moves on, sweeping and scanning,
scanning and sweeping. The job is repetitive; he’s heard of
people who’ve been driven mad by the repetition of it, lost their
grip. But Bart doesn’t mind. If only it weren’t so cold. If only
there were some Sun in the tunnels. But of course there isn’t.

“Sun is precious, Sun is rare,” he whispers behind his mouth
cover. “Sun is for the worthy.” And for Ajdenia, he adds
silently.

He tries to recall the warmth of Sun on his skin. He almost
succeeds.

And the worthiest of all get to live above, showered in sunlight.

He finishes the length of tunnel number 8 and enters the Ra
intersection. All the intersections are named after
long-lost deities associated with the Sun. The
next one is called Helios. Then there’s Tonatiuh,
Solar Logos, Surya. Bart knows them
all, every name, every inch. He always works
methodically. He’s good at what he does. He’s
worth five minutes in the filter room, under
the Sun. Maybe next year he’ll be worth five
and a half. Or six, even.

There’s another employee in tunnel
number 9. Bart takes a moment to observe
them. Their uniforms are identical. Their
mouth masks, their goggles, their hoods. He
wonders what that person looks like underneath.
He wonders what they are worth. Do
they also spend a few moments each day trying
to remember the feeling of Sun on their
skin? Are they about to lose their mind?

He continues his work on the intersection,
sweeping and scanning, scanning and
sweeping. The next time he looks, the other
person has disappeared behind a turn in
their tunnel.

Bart has finished an entire section of wall
when the alarms go off, ear-piercing. The
ceiling lights switch to the highest setting,
bright, almost blinding. Bart puts down
his scanner, sealer and sweeper, and heads
towards the centre of the intersection, as he’s 
supposed to. He’s almost there when a girl 
comes running out of tunnel number 7 and 
bumps into him, nearly throwing him off 
balance. He grabs her arm without thinking, 
steadies her. Her uniform is torn. She’s not 
wearing a mouth cover. Bart can see her eyes 
behind her goggles. He would expect them 
to look frightened, but they are not. There is 
something else in there. Something bright. 
It makes Bart think of the Sun. It makes him 
think of Ajdenia. 

“What are you doing?” he asks. “You're not 
supposed to be here.”

She brings a finger to her lips. “We can 
live under the Sun, you, me, all of us,” she 
whispers. “They are lying.” And then she lets 
go of him and she’s off, running into tunnel 
number 5. 

Bart thinks of following her, but he knows 
he’s not supposed to. He's supposed to stand 
in the middle of the intersection and wait 
for the alarms to go silent, for the lights to 
go back to normal. So that’s what he does. 

Soon, a pair of guards come out of
tunnel number 7, helmets shiny and batons

“Which way did she go?”
one of them asks. 

“Who?” Bart blurts out, and 
immediately receives a blow to 
the ribs from the guard's baton. 

“Your cooperation will be 
rewarded,” the other guard 
says. “Two more minutes 
under the filter will be added to 
your next payment, if the infor- 
mation you provide proves 
correct.” His tone implies that 
something will be taken away 
if not, but the exact nature of it 
is left vague. 

Two whole minutes of light, 
Bart thinks. Two whole minutes 
of Sun. 

As if noticing his hesitation, 
the guard who struck him 
scans Bart’s forehead, proving 
they'll keep their word, one 
way or another. “Come on,” he 
says. “Spit it out!” 

Could it be true? 

Bart raises his hand and 
points towards tunnel num- 
ber 5. 


Bart is in the filter room, waiting for his 
payment. He's thought of the girl in the Ra 
intersection often, ever since the day he 
found himself in her way. He’s thought about 
what might have happened to her, and about 
her words. Could they really live under the 
Sun, without slaving away in the tunnels in 
exchange for a few moments under the pro- 
tection of the filters? She was probably one of 
those who lost their minds in the tunnels, he 
assured himself in the end, one of those who 
didn’t know how to deal with the cold and the 
repetition, how to make themselves worthier 
than they are. Bart shuffles in his chair. And 
if not... but as the thought crosses his mind, 
the time finally comes, and the lid of the 
filter room opens, letting in the Sun. Bart 
unzips his side pocket and brings out the 
plastic pot with the pink flower growing in 
it. He raises it towards the light. “Drink up, 
Ajdenia,” he whispers.

He watches the pink petals shine against 
the Sun until the lid comes on again.


Natalia Theodoridou is a cultural-studies 
scholar and a writer of speculative fiction. 
Her writing has appeared in Clarkesworld, 
Daily SF, Escape Pod and elsewhere. Find 
out more at www.natalia-theodoridou.com.

advice on how to improve cognitive health and boost
brain power. Yet anyone curious enough to dip into the 
scientific literature will find a complicated picture behind 
the claims. There are seemingly contradictory studies, and 
gaps in our knowledge. This Outlook touches on a number of 
areas concerning how researchers are working to preserve or 
enhance this most human of faculties. 

It is an unavoidable fact of life that cognition worsens 
as we age. For most people, this decline is imperceptible 
and gradual, whereas for others it is more rapid. Several 
approaches could prevent or even reverse the decline: 
common drugs, including an asthma medication (S4), and 
interventions to increase social interaction (S14) are both 
showing promise. 

Some researchers go further and think that it is possible 
to gain cognitive benefit in youth. But claims about brain 
training are controversial: results vary widely and even 
meta-analyses come to contradictory conclusions about 
effectiveness (S10). 

If brain training doesn’t work, at least all that’s lost is time. 
Other approaches carry real risks. Transcranial direct-current 
stimulation aims to activate the brain by applying a burst of 
electricity, and its simplicity has spawned a do-it-yourself 
subculture. Although the technique has therapeutic potential, 
it is far from clear how it works — and zapping the brain is 
not to be taken lightly (S6). Similarly, the rise in the use of 
smart drugs has many people worried (S2). But there could 
be one safe way to improve cognitive performance: let our 
technology be a benefit instead of a distraction (S9).

We are pleased to acknowledge the financial support of 
Nestlé Research in producing this Outlook. As always, Nature 
retains sole responsibility for all editorial content.

Organizers of the ESL One Cologne competition tested video gamers for smart drugs for the first time last year — all the tests came back clean.


SMART DRUGS 


A dose of intelligence 


As mind sports becomes the new frontier for doping concerns, research is exploring 
whether users really get any value from ‘smart drugs’. 


BY AMBER DANCE 


gamers from around the world gathered
for the ESL One Cologne competition in 
Germany. With US$250,000 in prize money up 
for grabs, pressure was high, and competition 
organizer ESL wanted to ensure fair play. At 
some point during the two-day event, a ran- 
dom selection of players received a tap on the 
shoulder and were escorted to a discreet back 
room where a physician awaited. 

For the first time in its 16-year history, ESL 
was taking saliva samples on its lookout for 
dope. Smart drugs were allegedly circulating, 
helping players to get in the zone. “Just like in 
normal sports, it’s not OK to win because you 
took a pill,” says Anna Rozwandowicz, ESL’s
director of communications and the first head 
of its anti-doping initiative. That weekend, all 
the tests came back clean. 

E-athletes aren't the only ones allegedly pop- 
ping pills to try to enhance their mental facul- 
ties. Use of the drugs seems to be common, 
although finding firm data is not easy. In 2014, 
a survey of British and Irish students reported 
that more than 3% currently used prescription 
medications as cognitive enhancers. A 2013
survey of surgeons found that nearly 20% had
used medication for cognitive enhancement
at least once. And an informal survey
(go.nature.com/xmlrn2) from 2008 reported 
that a similar proportion of Nature readers had 
used medications off-label to improve memory 
or concentration. 

Many smart drugs are prescription medica- 
tions either purchased illegally or used off-label. 
Top choices include Adderall (amphetamine) 
and Ritalin (methylphenidate) — treatments 
for attention-deficit hyperactivity disorder 
(ADHD) — and modafinil, which is a medica- 
tion for sleep disorders such as narcolepsy. In 
people with ADHD or sleep disorders, these 
drugs can raise brain function so that it matches 
that of healthy people. But it is not clear whether
the same medications can push a neurologically 
healthy, well-rested individual onto a higher 
cognitive plane. There is also the question of 
side effects. Despite these uncertainties, the 
apparently widespread use of neuroenhancers 
has prompted an ethical debate about whether 
their use is fair in school exams or mental 
games. 


LIKE A BOSS 

It is hard to say just how much these medica- 
tions help an average person. Amphetamines 
improve focus and can make dull tasks seem 
interesting. So they might change a student's 
perspective from, ‘Ugh, chemistry’ to ‘Ooh!
Carbon bonds!’ — even though that student
is not any brighter. “They don't really live up 
to the name smart pills,” says Martha Farah, 
a cognitive neuroscientist at the University of 
Pennsylvania in Philadelphia. “Nothing that 
would turn you from a B to an A student or 
suddenly give you winning business ideas.” 

It’s still not clear precisely how these drugs 
produce their effects. Adderall and Ritalin 
are the best understood. Their main effects 
seem to relate to the neurotransmitters 
noradrenaline and dopamine, each of which 
mediates several effects, including attention 
and reward. Normally, a neuron releases 
these neurotransmitters as a message, tell- 
ing other neurons to fire or stay quiet. Once 
the signal has been received, the first neu- 
ron re-absorbs the neurotransmitters. These 
medications block that re-uptake, so that the 
signals persist. Amphetamines also have other 
actions, such as preventing the breakdown of 
neurotransmitters. 

Understanding of the cognitive-enhancement 
mechanism of modafinil is more sketchy. 
The drug affects “pretty much every major 
neurotransmitter in the brain”, says Ruairidh
Battleday, a neuroscientist at the University of 
California, Berkeley. These include dopamine 
and noradrenaline, so part of its effect is probably
similar to that of Adderall and Ritalin.
Extra neurotransmitters help parts of
the brain to communicate better, par- 
ticularly the prefrontal cortex, which 
neuroscientist Kimberly Urban calls 
the brain’s “boss”. When noradrenaline 
and dopamine are present in the right 
amounts, the boss is an effective man- 
ager, explains Urban, who works at the 
Children’s Hospital of Philadelphia. 
Too few neurotransmitters, and the 
boss is sluggish; too many, and it gets 
overwhelmed. The goal of treatments 
for ADHD and narcolepsy is to get the 
boss to the peak of function. Healthy 
users hope that they can raise their 
boss's peak. 

People seeking a chemical boost do have 
legal options. “By far the most commonly used 
neurocognitive enhancers are nicotine and caf-
feine,” says Peter Morgan, a psychiatrist at Yale
University School of Medicine in New Haven, 
Connecticut. Instead of blocking reuptake, caf- 
feine stimulates the release of extra noradrena- 
line and dopamine; its effects aren't as strong or 
as long-lasting as those provided by the other 
drugs. Nicotine mimics the neurotransmitter 
acetylcholine, which affects learning and mem- 
ory. And no one bans people from pumping up 
their brains by smoking or drinking coffee dur- 
ing competitions or before an exam. 


UNCLEAR BENEFITS 
Researchers are attempting to quantify the 
effects of prescription neuroenhancers in 
healthy people. In one study, Stefano Sensi, 
a neurologist at G. d’Annunzio University of
Chieti-Pescara in Italy, and his team asked 
26 university students to take an intelligence 
test. They then gave each volunteer either a dose 
of modafinil or a placebo before re-administer-
ing the test. For the test, called Raven's matri-
ces, participants were required to complete the
ninth pattern in a 3 x 3 geometric puzzle. They 
were scored on the number of grids that they 
answered correctly. Solving the puzzle requires 
quick and flexible thinking, called fluid intelligence.
Results were mixed and depended on the difficulty
of the matrices. Modafinil made no difference on the
easiest or the hardest puzzles. The drug did 
increase scores for the grids of medium diffi- 
culty, mostly for those who scored low in the 
pre-drug test; it made little difference to partici- 
pants who nailed the matrices on their first try. 
Sensi’s work was among 24 papers included 
in a 2015 review of modafinil in healthy peo-
ple. The studies used a variety of cognitive
tests, and the review found that, on average, 
modafinil did seem to help — particularly 
with decision-making, planning and fluid 
intelligence. The more complex the task, the 
more that modafinil helped. “On the basis 
of the evidence,” says Battleday, “modafinil 
is improving people’s performance.” But the
results were not uniformly positive. Not every
test showed benefits and, in a couple, the drug
seems to have stunted creativity.

[Photo caption: Competition at the ESL tournament was intense.]

The review authors also noted that many 
cognitive tests had been designed to assess 
impairment, not enhancement. For exam- 
ple, people with a brain injury or demen- 
tia may struggle with a clock-drawing test, 
but someone with normal cognition will 
usually get it right — leaving no room for 
smart drugs to assist. Psychologists have few 
options to adequately measure cognition in 
healthy people, says review co-author Anna- 
Katharine Brem, a neuropsychologist at the 
University of Oxford, UK. 

As with any mind-altering drug (caffeine
and nicotine included), addiction and depend-
ence are concerns. People who take drugs for
ADHD do not seem to get hooked, says James 
McGough, a child and adolescent psychiatrist 
at the University of California, Los Angeles. 
However, he does not know if the same drugs 
might prove addictive in healthy people. 
After all, Adderall is an amphetamine, which 
has established addictive properties. Ritalin 
and modafinil seem to be less addictive, says 
Urban, but that does not mean that regular use 
is without risk. Morgan points out that regular 
use of coffee and cigarettes causes consumers’ 
brains to adapt so that they need the stimu- 
lant just to function at their normal cognitive 
level. He suspects that the same might occur with
smart drugs, even if users lack the compulsive 
craving that characterizes addiction. 

As for long-term effects, nobody knows. 
Urban and colleagues’ experiments in rats 
indicate that Ritalin could be bad for develop-
ing brains. The researchers treated both adults
and juveniles with one milligram per kilogram 
body weight, which is within the normal range 
for human treatment. In the grown-up rats, the 
drug increased nerve firing in their prefrontal 
cortex. But in the 15-day-old rats, equivalent to 
a preteen human, firing went down. If Urban 
stopped the drug, the effects went away. But 
when she tripled the dosage (equivalent to a
high, but not unheard of, human prescription),
the firing rates stayed low even 70 days after
the treatment stopped. 

The neurotransmitters that the medications 
are known to target, noradrenaline
and dopamine, are crucial regulators 
of brain maturation during puberty, 
Urban explains. Although the drugs 
don’t seem to cause problems in teen- 
agers with ADHD, they might throw 
off development of a healthy brain. She
speculates that the poor firing patterns 
observed in the rats might translate to 
problems with working memory and 
flexible thinking in people. For exam- 
ple, someone might have a hard time 
finding a new route to work if their 
usual path is blocked. Indeed, she says, 
healthy children who take too much 
Ritalin can exhibit “extreme persever-
ance”: for example, being unable to pause a
video game when it’s time for dinner, persist- 
ing with one topic of conversation without 
being able to switch gears, or feeling emotions 
such as anger for a longer time than a situation 
warrants. 


SMART MORAL COMPASS 
Smart drugs are still primitive, Sensi says. They 
temporarily alter multiple neurotransmitters, 
so they aren't very specific. A better approach, 
he suggests, could be to develop drugs that 
would promote nerve-cell growth or the 
rewiring of the brain, inducing changes that 
would permanently enhance thinking. 
However, the current medications are still 
potent enough to raise ethical questions. One 
such concern revolves around social equality. 
Not everyone has equal access to smart drugs, 
and there is a danger that only the privileged 
will be able to get ahead with amped-up cog- 
nitive powers. The result would be yet another 
force widening the gap between the haves and 
the have-nots, says Nita Farahany, a bioethi- 
cist at Duke University in Durham, North 
Carolina. Or, there might be a sort of arms race,
with people taking ever more advanced smart 
drugs just to keep up. At her own institution, 
Farahany says, students were so worried about 
brain-boosters that the university amended its 
honour code in 2011 to state that “the unauthor- 
ized use of prescription medication to enhance 
academic performance” was a form of cheating. 
For now, at least, even the scientists who 
study smart drugs aren't relying on them. 
Most of those interviewed for this article said 
that they stick to coffee, tea or energy drinks. 
Morgan, for his part, suggested that the same 
cognitive benefits can be achieved by simply 
taking a refreshing nap.


Amber Dance is a freelance science writer 
based in Los Angeles, California. 
Restoration project


Future generations may have less to fear from cognitive 
decline thanks to microscopic insights into the ageing brain, 
and interventions from unexpected quarters. 


BY ANNABEL MCGILVRAY 


Ageing rodents have had a good few years.
In the United States, neurologist
Tony Wyss-Coray made old mice think
that they were young again by giving them 
plasma donated by students. Thousands of 
miles away in Austria, neuroscientist Ludwig 
Aigner achieved a similar feat in elderly rats 
using a common asthma medication. 

These are preliminary results, but both 
researchers are attempting to rewrite the story 
on cognitive decline. “It’s important to think 
about the aged brain differently from how we 
used to think about it,” says Aigner, who is 
head of the Institute of Molecular Regenera- 
tive Medicine at Paracelsus Medical University 
in Salzburg. Rejuvenation might be, as he says, 
“a sexy term”, but it is one that researchers such
as Aigner and Wyss-Coray are using seriously 
in relation to preventing cognitive loss in the 
healthy ageing brain. Ageing does not have to 
be a one-way, downhill street. But reversing the
direction requires a better understanding of why 
the brain begins to decline in the first place. 


KEEPING NUMBERS UP 

In cerebral terms, ageing starts early, long before 
symptoms manifest. The typical human brain 
begins to shrink at about age 20; by the time it is 
100 years old, the brain has lost 20% of its mass.
And this is for a healthy brain. Those affected 
by neurodegenerative disorders such as Alzhei- 
mer’s disease dwindle even more. 

Ageing involves the gradual deterioration 
of the myelin sheathing that surrounds some 
nerves, and which, alongside glial cells, com- 
prises the brain’s white matter. There is some 
loss of neurons, but this — in a revision of the 
traditional dogma — is not the main driver 
of the decline. Instead, neurons have reduced 
function, and the connections between them 
are weakened. 

Decline isn’t uniform in the brain or across 
cognitive tasks. Particularly hard hit are the
hippocampus, which is crucial for memory 
formation and short- to mid-term storage, 
and the prefrontal cortex, which manages 
decision making and planning. This means 
that when deterioration occurs, memory of 
recent events and facts fades quicker than 
items that have been stored for a long time, 
such as words and numbers. 

“There are a myriad of changes compared 
to the young brain that cross every level, every 
cell type,” says Wyss-Coray, who is at Stanford 
University in California. In mice, he says, the 
changes begin at 6 months, which is equivalent 
to a 30-year-old human. By the time mice reach 
an age that corresponds to the 50s and 60s, he 
says, “there are already striking changes at the 
cellular level in the brain.” These include modifi-
cations to the way DNA is expressed, epigenetic
alterations that affect which genes are active and 
marked shifts in the communication between 
cells. Furthermore, metabolism declines in both 
white and grey matter as the cell’s powerhouses, 
mitochondria, start to fail. 

At Mount Sinai School of Medicine in New 
York, neuroscientist Mark Baxter and other 
researchers at the institute are exploring the 
impact of ageing on synapses — the connec- 
tions between neurons. His colleagues used 
electron microscopy to examine changes in the 
synapses of the hippocampal and prefrontal 
cortex regions of ageing rhesus monkeys. The 
monkeys were given tasks that tested working 
memory, which is associated with the temporal 
lobe (where the hippocampus is located). The 
researchers used a delayed-response task in 
which the monkeys needed to remember one 
object for a period of time and then select a sec-
ond non-matching object in order to receive a
food reward. The team found that monkeys
that had more problems on the memory task 
tended to have more axon terminals (the ends 
of neurons which connect to other neurons) 
with a single or no synapse. “The underlying 
synaptic structure of those neurons is being 
degraded,” says Baxter. 

At the other ends of neurons are dendrites, 
which connect to axon terminals of other 
neurons by dendritic spines. Neurons in the 
prefrontal cortex have three different types of 
spines: stubby, mushroom-shaped and thin. 
The latter emerge from, and retract into, the 
dendrites as required, giving the prefron- 
tal cortex great plasticity. As the brain ages, 
synapses in this region become dominated 
by mushroom-shaped connections, which 
are associated with long-term memory for- 
mation, but not with mental flexibility (see 
‘Thinning out’). Similar to the altered synapses
that Baxter’s colleagues saw in the monkeys’
temporal lobes, ageing monkeys with declin- 
ing working memory have an altered synaptic 
environment in the prefrontal cortex. “The 
thin spines are lost,” says Baxter.

These structural changes do not rule 
out therapeutics to treat cognitive decline, 
according to Baxter. “The basic elements of
neural computation, the neurons, are still
there to work with,” he says. One possible way
to preserve both the thin dendritic spines in 
the prefrontal cortex and the multisynapse 
connections in the hippocampus is to use 
oestrogen — in women at least. In older female 
rhesus monkeys, treatment with oestrogen 
increased the density of the thin spines and 
correlated with enhanced performance in 
delayed-response tasks. The researchers
experimented with a continuous delivery sys- 
tem and one that mimics the menstrual cycle. 
So far, they have found that the latter works 
best in female monkeys. “If you take oestrogen,
you want to take it cyclically and not continu-
ously,” says Baxter.


A DOSE OF YOUTH

Wyss-Coray began his work with mice and 
plasma by studying the effects of parabiosis — 
the joining of two animals so that they share 
blood — on the ageing brain. He and his col- 
leagues showed that when older mice share 
circulation with younger mice, the brains 
of the older mice show dramatic changes.
Neuron growth is boosted, and there are 
improvements in the animals’ ability to learn 
and speed of recall. 

The same effect is seen when plasma from 
young volunteers is transfused into old mice. 
Once every 3 days for 3 weeks, Wyss-Coray’s 
team gave old mice with cognitive impair- 
ment 15 microlitres of plasma (the equivalent 
of one unit — around 200 ml — for humans) 
donated by students. The treated mice were able 
to remember the location of one hole among 
many that would allow them to shelter from a 
strong light shining on the testing platform. The 
untreated mice could only find the hole through 
trial and error. Is it rejuvenation? “Rejuvenation 
in mice, definitely,” Wyss-Coray says.

Wyss-Coray has now begun a clinical trial 
of the use of plasma from young donors to 
treat patients with Alzheimer’s disease. The 
double-blind trial was originally scheduled 
for completion in October 2015, but it is pro- 
gressing slowly. There is no lack of willing 
participants, he says, but recruitment has been 
difficult owing to participation criteria that 
exclude certain common medical conditions 
and medications. Over the course of a month, 
half the trial participants receive weekly infu- 
sions of one unit of plasma from young (under 
30 years) male donors; the remainder receive 
saline solution. The participants are then 
tested for memory, language, spatial orienta- 
tion and visual attention. 

In the meantime, work is continuing to 
determine what molecules underpin the 
parabiosis effect. Wyss-Coray suspects that 
the answer may come from outside the brain. 
He and his colleagues, including Aigner, have 
already identified eotaxin — commonly associ- 
ated with asthma — as one protein that is pre- 
sent in higher quantities in the plasma of older 
animals. This research prompted Aigner to
look at other asthma signalling proteins, and
led to the discovery of the rejuvenating powers
of the common asthma drug montelukast.

[Figure caption: Neurons from a young (left) and old non-human primate have a similar number of branch-like dendrites, but there are fewer spines on each dendrite of the older animal, with the thin spines notably absent. Label: Young neurons are covered in many spines of different shapes.]


INFLAMMATORY WORK 

Aigner’s breakthrough attracted headlines 
around the world in 2015. His team showed 
that montelukast improved cognition and 
boosted neuron growth in old rats. He stops
short of saying that the drug completely
restores function. “But yes, we can partially
rejuvenate the brain.”

Montelukast blocks the inflammatory action 
that a class of signalling molecules called leukotrienes can
trigger in the lungs. These molecules also appear
in the brain, where, according to Aigner,
the level of an enzyme involved in the produc-
tion of leukotrienes increases during ageing.
Leukotrienes contribute to neural inflamma- 
tion, cell death and the unnecessary activation 
of the brain’s immune cells, microglia, which 
can damage healthy neurons. 

By giving rats with age-related cognitive 
decline montelukast over a six-week period, 
Aigner’s team stopped the inflammatory 
action of the leukotrienes and improved the 
animals’ cognitive function. Further testing 
in rats with dementia suggests that the drug 
causes a complete reversal in disease-related 
cognitive decline. Aigner is hoping to secure 
funding for a clinical trial of montelukast in 
patients with Parkinson's disease. 

It is no coincidence that Aigner and Wyss- 
Coray’s potential interventions against the cog- 
nitive decline of the healthy brain are similar to 
the possible treatments of neurodegenerative 
disorders such as Alzheimer’s and Parkinson’s.
The ageing process in a healthy brain resem- 
bles the early stages of pathological deteriora- 
tion in these types of dementia. Aigner points 
out that many of the microscopic markers of 
neurodegenerative disease (synaptic decline,
reduced neurogenesis, inflammation and even
the notorious amyloid plaques associated with
Alzheimer’s) are also present in the ageing
brain of a healthy person.

Wyss-Coray agrees that current knowledge 
points to many microscopic similarities 
between pathological and non-pathological 
ageing of the brain. At the cellular level, “it’s 
very hard to discriminate normal healthy 
ageing from disease.” He speculates that if 
everybody were to live to 100 years old, most 
people would develop dementia. “If you look 
at a normal healthy control, most of those are 
on their way to getting a neurodegenerative 
disease,” he says. 

The proportion of the world’s population 
aged over 60 is forecast to nearly double to 22% 
in 2050. Cognitive impairment is already one 
of the leading causes of admission to residen- 
tial care; without some form of prevention or 
rejuvenation, caring for older populations will 
become much more of a burden.

Neurodegenerative diseases remain the 
priority, says Baxter, but against a background 
of an ageing population, neuroscientists recog- 
nize the importance of keeping as many people 
as possible mentally fit. “We want everyone to 
have excellent cognition as they get older, so 
they have a good quality of life,” he says. “And 
they’re not struggling to remember whether 
they turned their stove off, or if they took their 
high-blood-pressure pill that morning.”


Annabel McGilvray is a freelance writer 
based in Sydney, Australia. 
Neurobiologist Flavio Frohlich receives transcranial direct-current stimulation.


NEUROSTIMULATION 


Bright sparks 


As neuroscientists explore the therapeutic prospects of 
brain stimulation, the amateur community are hoping the 
technology will enhance their mental faculties or well-being. 


BY KATHERINE BOURZAC 


Lincoln walks into the neurohacker meet-up
I am attending in a garage in San
Francisco. He takes off his flat-brimmed
baseball cap, exposing two small, red burns on 
the side of his face. The day before, he tried a 
brain-stimulation method called transcranial 
direct-current stimulation (tDCS) for the first 
time. “It was pretty intense,” he says. 

Lincoln had one electrode on his right 
temple, and another at the bottom of his left 
deltoid. When turned on, 2.5 milliamps of cur- 
rent flowed into his brain, through his medial 
prefrontal cortex, down to his shoulder. At least,
that’s what this set-up was intended to do, he 
says. The idea was that the stimulation would 
help Lincoln, a software programmer who has 
meditated regularly for several years, to achieve 
a mystic state. That didn’t happen, although he 
says he did feel a heightened awareness. After 
40 minutes of stimulation, he also experienced 
twitching in his legs. 
Neuroscientists who have been studying
the use of low-intensity electrical current to 
stimulate the brain have produced tantalizing 
results that have, not surprisingly, encouraged 
amateur use. They have shown boosts in learn- 
ing, memory and performance on mathematical 
tests, as well as early success in treating depres- 
sion and helping the recovery of those who have 
had a stroke. Brain stimulation is easy to do at 
home, either by building a tDCS set-up using 
some simple wiring and a battery, or buying one 
ready-to-use from any of the ten or so compa- 
nies selling them online. Some users are seeking 
cognitive enhancement, whether it’s to achieve 
mindfulness or a memory boost; others are try- 
ing to treat mental illnesses such as depression. 

If stimulation is easy, neuroscientists warn, 
doing it right is not. Companies selling these 
devices direct to consumers are “smartly 
circumventing government regulation” in 
the same way that the supplement industry 
is, says Flavio Frohlich, a neurobiologist at 
the University of North Carolina School of 
Medicine in Chapel Hill. “People may well be
damaging their brains.” 

There is a disconnect between carefully 
done, quantitative research on tDCS, and more 
exploratory use at home or even, research- 
ers say, in the laboratories of less-experienced 
scientists. Improper use and some ambiguous 
meta-analyses — as well as evidence of harm 
or negative results — have fed into a backlash 
against the technology within the neuroscience 
community. Brain-stimulation specialists are 
calling for a more nuanced understanding of the 
technology and its uses. Researchers are excited 
about the possibilities of brain stimulation for 
cognitive enhancement and therapy. They just 
want to take their time to validate it. 


BIOELECTRIC BOOST 

Electric stimulation has come in and out of 
fashion since the eighteenth century, when 
Italian physician Luigi Galvani famously 
made frogs’ legs jump with an electric cur-
rent, and naturalist Alexander von Humboldt
stuck wires in his back with the aim of 
understanding the excitability of nerves and 
muscles. In the nineteenth and twentieth cen- 
turies, physicians administered shock therapy 
to patients, inspiring fictional characters from 
the monster in Frankenstein to the horrifying 
Nurse Ratched in One Flew Over the Cuckoo's 
Nest. Today’s researchers have tamed both 
the method and, they hope, the image of 
electrical stimulation. 

A gentler version was popularized by research 
at the University of Göttingen in Germany
led by neurophysiologists Walter Paulus and 
Michael Nitsche, who began experimenting 
with low levels of electrical brain stimula- 
tion in 1999. Although a typical dose of 
electroconvulsive therapy (which is used spar- 
ingly to treat depression) might approach 1 
amp, the tDCS revived by the Göttingen group
uses a tiny fraction of that — typically only 1 to 
2 milliamps. That is low enough to be done with 
a standard 9-volt battery. 

This weak stimulation cannot directly make 
neurons fire — instead, it generates a diffuse 
electrical current that changes their membrane 
potential. Neurons under the anode, the posi-
tive electrode through which current enters the head,
become more likely to fire when they receive
signals from other neurons. Neurons under
the cathode, the negative electrode, become
less likely to fire. It is very difficult to target a 
specific region of the brain, especially with 
simple home set-ups that use wet sponges as 
the contact points. 

In their early work, Paulus and Nitsche 
used tDCS mainly to study motor learning 
and working memory. But soon, many other 
researchers began exploring its potential for 
cognitive enhancement. They reported that 
brain stimulation acts a bit like caffeine, and 
may help people to learn faster. “It seems to give 
you any kind of benefit you want,” says Frohlich.

Such cheerful pronouncements earned tDCS 
a label of ‘too good to be true’. And indeed, the
technique has now reached the backlash point 
in its hype cycle. Many of the positive tDCS 
studies have been criticized for a lack of rigour; 
more carefully controlled experiments are start- 
ing to show negative results. Last year, Frohlich 
and his colleagues wrote a report suggesting 
that stimulation can actually be detrimental to 
IQ scores. His team gave a standard IQ test to 
40 people, who then received either sham or real 
tDCS of 2 milliamps for 20 minutes over the left 
or right prefrontal cortex, or both sides. When 
people took the IQ test again, everyone's scores 
were higher (because of the well-known retest 
effect), but those who were stimulated actually 
had a smaller increase than the placebo group.
The subpar performance came in a particular 
part of the test that assessed fluid intelligence 
— the ability to solve new problems on the fly. 

A few meta-analyses of tDCS studies have 
brought the entire field into question — and 
that includes lab-based research as well as the 
amateur use. One analysis by a group at the 
University of Melbourne, Australia, concluded 
that tDCS had “little-to-no reliable” effects.
The authors, whose analysis has been sharply 
criticized by the brain-stimulation community, 
declined to speak to Nature for this feature. 

Nitsche and other prominent brain-stimu- 
lation specialists say that the methods used in 
this — and other — meta-analyses have been 
poor. In particular, they contend that it does not 
make sense to pool the results from studies that 
used different experimental set-ups and equip- 
ment, even if they looked at similar cognitive 
tasks. Marom Bikson, a bioengineer at the City 
University of New York, says that it would be 
like doing a meta-analysis of clinical trials for 
two drugs, only one of which works. The posi- 
tive and negative results would cancel each other 
out, but it would be absurd to then conclude that 
neither drug works. What is important is not to 
average out the results from different electrical 
stimulation set-ups in a meta-analysis, but to do 
work that is reproducible, he says. 

Brain stimulation is complicated, says 
Bikson. Frohlich’s IQ-deficit finding, for exam- 
ple, shows that there may be off-target effects 
that researchers miss. And the poor spatial 
resolution of tDCS means that researchers 
should design experiments carefully to make 
sure that they are definitely targeting the part 
of the brain they’re interested in, says Frohlich. 
Thus, even when a study leads to positive 
results, researchers may misinterpret the 
outcome unless they have carefully validated 
which area of the brain they are stimulating. 


STIMULATING MISUNDERSTANDINGS 

In the scientific community, Frohlich’s work 
is respected by both tDCS proponents and 
doubters. The do-it-yourself community, 
however, seems to have adopted a more 
defensive response. One of the most popular 
blogs, DIY tDCS, pointed to the negative 
coverage under the headline “Why your 
brain stimulator is probably not making
you stupider”, pointing out the differences
between the set-ups for home users and those
used in Frohlich’s study. Whether that is the
case or not, unavoidable inconsistencies
in the use of the home devices, from the
intensity of the current to the placement of
the electrodes, are troubling for researchers
and brain hackers alike.

[Figure: BOYS’ OWN BRAIN BUZZ. Although the over-60s are represented, young men are the most frequent users of transcranial direct-current stimulation kits at home. Of 121 users, most were looking for cognitive enhancement, but some were treating conditions such as attention-deficit hyperactivity disorder (ADHD) and obsessive-compulsive disorder (OCD).]

However, warnings about tDCS seem to 
be trickling down, at least in the San Fran-
cisco hacker community. At a weekly meet-up
of the local branch of the international
NeurotechX hacker organization, of which
Lincoln is a member, I am chatting with six
programmers gathered around folding
tables and couches. They are talking about a 
3D printed electroencephalogram (EEG) cap 
that is available online, and working on open- 
source software for brain-computer interfaces. 
Computer-science graduate student 
Damien talks about the excitement of 
“exploring your own brain” using EEG feed- 
back. But stimulation? No way, say most 
of them. “It hasn't been proven that it’s 
harmless,” says software engineer Marion, 
who hosts the meet-up. This group takes a 
read-only approach, recording the brain’s 
electric signals, but not stimulating them. 
When Lincoln walks in with the burns on 
his face, I am not the only one to raise an 
eyebrow: the side effects he noticed and
the uncertainty about what really happened
in his brain reinforce their scepticism.
Lincoln says he wouldn't use that particular 
set-up again, especially not with the same 
company’s tDCS device, but he might try 
another set-up. 

It’s difficult to know what people are 
actually doing at home, and how many home 
users there are. A tDCS forum on the social- 
news website Reddit (known as a subreddit) 
had around 8,000 members at the beginning 
of 2016, but not all readers are necessarily 
users of the technology. Posts include tips on 
electrode placement, links to media coverage 
of scientific results and some alarming ques- 
tions, such as one from a reader who asked 
whether childhood epilepsy makes tDCS 
more risky as an adult. 

So far, Anita Jwa at Stanford Law School 
in California, whose research focuses on 
the intersection between law and neurosci- 
ence, is the only researcher who has studied 
home users. Jwa says that tDCS users do not
meet up in person very much, as far as she 
knows, and that the online community has a 
self-regulating aspect; the reader who asked 
about epilepsy, for example, was told to ask 
his doctor or simply not to take the risk. On 
the basis of surveys posted on two popular
websites, the tDCS subreddit and DIY tDCS,
Jwa found that users were mostly in their 20s 
and 30s and 94% were male. 

Home tDCS users, says Jwa, tend not to 
see themselves as experimentalists who are 
adding to the pool of knowledge. “Most users 
are doing it for cognitive enhancement,” 
she says (see ‘Boys’ own brain buzz’). Most
people who sought cognitive enhancement 
wanted to boost attention, learning or work- 
ing memory. 


SPEAKING THE BRAIN’S LANGUAGE 

While repeating the mantra ‘don’t try this at 
home’, neuroscientists admit that people treat-
ing themselves for illnesses such as depression
are trying to make up for the real shortcom- 
ings of mainstream medicine. “People are 
desperate, they are driven to these things 
out of a lack of effective tools,” says Adam 
Gazzaley, a neuroscientist and psychiatrist at 
the University of California, San Francisco. 
Gazzaley is working on combining brain 
stimulation with brain-training video games 
for cognitive enhancement. 

What's especially frustrating for neurosci- 
entists is that brain stimulation has real prom- 
ise for treating conditions and for cognitive 
enhancement — the very things that companies 
are implying their machines do (while taking 
pains to avoid making actual medical claims, 
which would trigger government regulation). 
Ultimately, validated devices and stimulation 
procedures will replace what’s available today. 
When researchers come up with something better,
“the snake oil will go away”, says Gazzaley.

To get there, some researchers are seeking a 
better understanding of the mechanisms behind 
tDCS. “We need to find out how it works so we 
can make it better,” says Frohlich. Bikson disa- 
grees — the technique is too promising to stop 
and wait for a full mechanism, he contends. 

For those with a biophysics bent, the mys- 
tery about the mechanisms is a great motiva- 
tion. Cognitive scientists Ludovica Labruna and 
Richard Ivry, of the University of California, 
Berkeley, fall into this camp. They hope to use a 
better-understood brain-stimulation method to 
try and illuminate the workings of tDCS. 

Transcranial magnetic stimulation (TMS) 
uses a focused magnetic field to cause small 
numbers of neurons to fire. TMS’s direct effect 
is much easier to measure in humans than 
tDCS’s more mysterious influence. Because of 
this, researchers know that all kinds of things 
can influence a person's sensitivity to TMS,
including skin conductivity, skull thickness and
even subtle differences in brain anatomy.

Left: The burns Lincoln received from a DIY kit. Right: Others have more success using a different set-up.

Labruna shows me how TMS sensitivity 
can be quantified. She tapes an electrode near 
the web of skin between my index finger and 
thumb. This electrode will measure the volt- 
age in my hand when she stimulates my brain 
with a TMS paddle. Labruna locates the part 
of my motor cortex that controls this particular
muscle in the hand, then cranks up the TMS.
The machine can focus a magnetic field of about
3 tesla in a small area of the cortex; TMS sensitivity
is measured by determining
what percentage of this maximum magnetic
strength is required to provoke a threshold
potential of 1 millivolt in a muscle. After a few
tries, my hand jerks like a puppet. My sensitiv- 
ity is medium-high — whether it’s because my 
skull is thin, or something else, I respond when 
the field’s intensity is only 42% of its maximum 
(most people respond at about 50% maximum 
intensity; some very sensitive people respond at 
about 29%, others not until about 60%). 
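
The threshold measurement described above can be sketched as a simple search: step the stimulator output up until the muscle response first reaches 1 millivolt. This is only an illustration, not the protocol used in the Berkeley lab; the function names, the one-point step size and the toy response curve are all assumptions.

```python
from typing import Callable, Optional

THRESHOLD_MV = 1.0  # threshold potential quoted in the text (1 millivolt)


def motor_threshold(response_mv: Callable[[int], float],
                    step: int = 1) -> Optional[int]:
    """Return the lowest stimulator output (% of maximum) whose evoked
    response reaches THRESHOLD_MV, or None if no setting does."""
    for percent in range(step, 101, step):
        if response_mv(percent) >= THRESHOLD_MV:
            return percent
    return None


def toy_response(percent: int) -> float:
    """Hypothetical response curve: silent up to 40% of maximum, then
    rising by 0.5 mV per percentage point."""
    return max(0.0, (percent - 40) * 0.5)


print(motor_threshold(toy_response))  # 42 for this toy curve
```

A lower returned percentage corresponds to a more sensitive person, which is the sense in which 29% is 'very sensitive' and 60% is not.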

In preliminary results presented at the 
Society for Neuroscience meeting in Octo- 
ber 2015, Labruna, Ivry and their colleagues 
showed that the more sensitive a person is to 
TMS, the more readily they respond to tDCS, 
and the better they do in a motor-learning task 
compared with both those who had a sham 
stimulation and those who were less sensitive 
to TMS. The researchers are now designing an 
experiment that will test whether this correla- 
tion holds in tDCS experiments that look at 
other types of cognition. 


ALTERNATE REALITY 
Although tDCS has been getting most of the 
buzz, there are other kinds of electric brain 
stimulation that may work better, or at least 
have different applications. Some research- 
ers, for example, are using alternating current, 
which targets brain oscillations, instead of direct 
current, on the basis of the theory that this may 
be a more natural way to stimulate the brain.
Transcranial alternating current stimulation
(tACS) emerged in 2006, when researchers in
Germany showed that stimulating the brains of 
healthy people at 0.75 hertz (at the lower end of 
the frequency range of delta waves) during the 
early stages of sleep encouraged this rhythm 
and enhanced memory retention. Other frequencies
are associated with different cognitive
states: theta (5-8 Hz) with working memory 
and gamma (>30 Hz) with memory mainte- 
nance, although these associations are broad 
and highly dependent on where in the brain 
the patterns are measured. 

When you close your eyes and relax, the 
brain's oscillations are about 10 Hz (within 
the alpha frequency of 7.5-12.5 Hz). In a preliminary
study, Frohlich and his colleagues
measured a person's alpha frequency with an 
EEG, then applied an alternating current at 
a matching frequency. They found that this 
enhanced creativity.
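
The idea of matching stimulation to a person's own alpha rhythm can be sketched in a few lines: estimate which frequency in the alpha band carries the most power, then use that value as the stimulation frequency. This is a toy illustration on a synthetic signal, not the method used by Frohlich's group; the function names, the 0.5 Hz search step and the synthetic 'EEG' are assumptions.

```python
import math

ALPHA_BAND_HZ = (7.5, 12.5)  # alpha range quoted in the text


def power_at(signal, sample_rate_hz, freq_hz):
    """Power of `signal` at a single frequency (one-bin Fourier sum)."""
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
             for i, x in enumerate(signal))
    return re * re + im * im


def peak_alpha_hz(signal, sample_rate_hz, step_hz=0.5):
    """Return the alpha-band frequency with the most power."""
    low, high = ALPHA_BAND_HZ
    n_steps = int(round((high - low) / step_hz))
    candidates = [low + k * step_hz for k in range(n_steps + 1)]
    return max(candidates,
               key=lambda f: power_at(signal, sample_rate_hz, f))


# Synthetic recording: 4 s sampled at 100 Hz, dominated by a 10 Hz
# rhythm with a weaker 4 Hz component mixed in.
rate = 100
eeg = [math.sin(2 * math.pi * 10 * i / rate)
       + 0.3 * math.sin(2 * math.pi * 4 * i / rate)
       for i in range(4 * rate)]

stimulation_hz = peak_alpha_hz(eeg, rate)
print(stimulation_hz)  # 10.0 for this synthetic signal
```

Real recordings are noisy and would need averaging over many short windows before a peak could be trusted.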

Like tDCS, the mechanism of tACS is not 
clear. One theory is that it might be more 
targeted because the rhythmic stimulations
interact with existing brain activity only at a 
particular frequency. The effects of tACS are 
also thought to be more short-term than those 
of tDCS, says Roi Cohen Kadosh, a cognitive 
neuroscientist at the University of Oxford, UK. 
That still needs to be proved, but it’s an exciting 
hypothesis, says Frohlich. Whatever the kind 
of current, the brain isn’t a simple machine, he 
cautions, and it isn’t possible to turn a function 
on like a light switch. “We have to speak the 
language of the brain and understand how the 
brain responds,” Frohlich says.

Despite the backlash against tDCS, many 
neuroscientists have no doubt that transcra- 
nial electrical stimulation will come to be an 
important tool for cognitive health and well- 
being. “The brain uses both neurotransmitters 
and electric fields to communicate,” says Sarah
Hollingsworth Lisanby, director of the divi- 
sion of translational research at the National 
Institute of Mental Health in Bethesda, Mary- 
land. Therefore, she says, we should use both 
channels for therapy. Moreover, she adds, non- 
invasive stimulation may be a way to intervene 
earlier in the development of mental illness — 
perhaps even to prevent it. 

Bikson thinks in a similar way. The 
commonality between brain stimulation for 
cognitive enhancement and for therapy is that 
both involve learning. You are “trying to teach
the brain”, he says, “either a new trick, or not
to be sick.”


Katherine Bourzac is a freelance journalist 
based in San Francisco, California.


Time to expand the mind


Thoughtful use of ubiquitous technology can improve mental ability
more than drugs and devices, say Nicholas S. Fitz and Peter B. Reiner.


Few objectives are more desirable than improving mental
performance. Whether it is to enhance attention and memory
or to stave off the normal decline of ageing, the drive for cognitive
enhancement has caught the imagination of scientists and the
public. The media, in particular, regularly report strategies that offer
‘limitless’ abilities. Despite the hype, drugs and devices produce, at
best, only modest gains. Indeed, the current evidence indicates that
the best ways to improve mental ability are the familiar approaches
of proper sleep, good nutrition and exercise.

One reason may be that our brains are already working at
near-optimal capacity; attempts to modify them bump up against
the hard limits of neurobiology. If this is the case, then we need to
find a new strategy for boosting cognitive power. The best path, we
believe, will be to improve the way in which we blend our mental
capabilities with the powerful algorithms in our computers and
smartphones.


PERSPECTIVE SHIFT

It is hard to overstate the degree to which information technology
has permeated today’s world. For many, reliance on technology
begins when they awaken, continues throughout the day and ends
only when they drift off to sleep. Nearly half of the adult population
worldwide owns a smartphone. But to call these devices phones is a
misnomer; the functions they carry out are incredibly diverse,
including storing information, sending and receiving messages,
and videoconferencing. Sometimes it feels as if technology is
supplanting thinking, but this worry is somewhat misdirected. Rather,
these devices are extending the reach of our cognitive abilities, so
much so that a statement that once seemed radical is increasingly
realistic: we have become proto-cyborgs.

We are still a long way from the transhumanist fantasy of upgrading
our brains with implantable computer chips. But we have
entered a transitional era in which we are commingling our cognitive
space with technology. In doing so, we have enlisted the assistance
of what might be termed technologies of the extended mind.

Consider the ways in which these tools already enhance cognition.
Never before have humans had the ability to find answers in
the blink of an eye, and to store vast amounts of data with greater
fidelity than biological memory. As the Internet of Things gains
momentum, we will find ourselves interacting with ‘intelligent’
objects that predict our preferences and make decisions on our
behalf. Ideally, delegation of these tasks to our devices would allow
us to expend more energy pursuing challenging activities such as
improving willpower and analytical thinking.

But that is not how the human-technology connection is playing
out. Instead, the same devices that extend some cognitive abilities
degrade others. People grappling with information overload often
bemoan the difficulty they have in concentrating for long periods
of time. Although we seem to be constantly connected to others,
some overuse technology at the expense of face-to-face social 
engagement. And it is the rare individual who does not succumb to 
the addictive appeal of devices that provide instant rewards. Turn- 
ing them off is impractical because it can leave us disconnected 
from information that is truly valuable. Our gnawing anxiety about 
overusing our technology is only partly soothed by the current crop 
of apps, such as Freedom or Moment, that attempt to quiet the noise 
of digital life by blocking distracting websites for a limited period. A 
better strategy, however, is to embed solutions in the design of both 
the devices and the environments in which we use them. 


THE ATTENTION ECONOMY 

One approach that merits further consideration is the idea of
‘calm technology’: interfaces that inform without distracting.
For example, your phone might automati- 
cally go silent when it knows that you are 
in a meeting, unless the message is urgent. 
Behavioural nudges that help people to 
more closely align their technology use with 
their intentions could reinforce embed- 
ded design principles. Such alterations of 
the digital and physical environments can 
foster a healthier relationship with our 
devices. We endorse fledgling movements 
such as Time Well Spent that call for a broad 
coalition to work towards real-world solu- 
tions. We encourage the scientific commu- 
nity to investigate the implications — both 
beneficial and deleterious — of technology 
adoption. We call on policymakers and 
behavioural scientists to consider crea- 
tive means of encouraging best practices by the public. And we 
challenge technology designers to improve the user experience so 
that cognitive health is no longer sacrificed on the altar of profit 
margins. 

Our growing reliance on ubiquitous computing technologies 
defines the modern age. Melding these technologies with thought- 
ful design and effective nudges that reinforce healthy social norms 
will ensure that they genuinely improve the human condition.


Nicholas S. Fitz is a graduate student at the National Core for 
Neuroethics at the University of British Columbia. Peter B. Reiner is 
a professor at the National Core for Neuroethics at the University of 
British Columbia. 

e-mail: peter.reiner@ubc.ca


An advert for Pelmanism, a brain-training technique that became popular in the early twentieth century. 


BRAIN TRAINING 


Memory games 


Conflicting results are expected in a young field, but what do 
you do when even the meta-analyses do not agree? 


BY SIMON MAKIN 
“New Minds for Old in 12
Weeks!” proclaimed adverts for
Pelmanism, a brain-training 
technique that swept the United King- 
dom in the early part of the twentieth cen- 
tury. Promotional material claimed that 
this system could combat such troubling 
mental phenomena as “lack of ideas” and 
“brain fag”. It became so successful that the 
Pelman Institute established offices in Aus- 
tralia, South Africa and the United States. 
Pelmanism was still being promoted as late 
as the 1960s, but has since sunk into obscu- 
rity, becoming merely a curious chapter in 
the history of psychology. 

Except that almost 100 years later, brain 
training is back — this time with scientific 
backing. The first apparent breakthrough 
came in 2002 when a group of researchers in 
Sweden showed that training children with 
attention-deficit hyperactivity disorder on 
adaptive working-memory tasks (which
test a person's ability to retain and manipulate
information over a short period of time
and that increase in difficulty to match
performance) improved their attention
and reasoning. But the real excitement
came in 2008 with research led by psycholo- 
gist Susanne Jaeggi, now at the University of 
California, Irvine. Jaeggi’s study seemed to 
show that healthy young adults who practised 
an adaptive working-memory task, which the 
authors called dual n-back (see ‘A test of sight 
and sound’), showed increases in the unre- 
lated ability of fluid intelligence (the ability 
to reason in novel situations). Furthermore,
there was a dose effect: the more that people 
trained, the ‘smarter’ they became. 

And, as with Pelmanism, a lucrative industry 
has grown up around the idea that cognitive 
performance can be enhanced by training. 
But the claim that mere hours spent playing a 
memory game can increase intelligence is an 
extraordinary one, and sceptics soon started 
voicing objections. Negative studies appeared, 
and the field is now awash with conflicting
results. When faced with a large but
uncertain evidence base, researchers usually
turn to meta-analysis to assess the evidence.
Unfortunately, in this field, even meta-analyses
are producing divergent conclusions. Some
researchers have suggested that the
inconsistencies stem from the use of inadequate
control groups and measures of outcome. But
most agree that the field needs bigger, better
studies, and a return to basic science.

NATURE.COM
To watch an animation of the dual n-back test visit:
go.nature.com/dofbdm


THE TROUBLE WITH TRAINING 
Numerous studies purport to show benefits 
from cognitive training, but delve deeper 
and there is less than meets the eye. Many of 
these studies show little more than improved 
performance on tasks closely related to those 
that participants trained for — known as near 
transfer. “When you practise something, of
course you get better at it,” says Jaeggi. “The
real question is whether there is far transfer”,
referring to when training benefits different
cognitive abilities. This is why a lot of
training research has focused on working 
memory. Working-memory capacity predicts 
everything from reading ability to academic 
achievement, and correlates highly with fluid 
intelligence. Increasing this capacity might, 
therefore, have a broad impact on cognition. 

Among the most prolific early sceptics of 
far-transfer effects were Randall Engle and his 
fellow psychologists at the Georgia Institute of 
Technology in Atlanta. They pointed to two 
main recurring problems in working-memory 
training studies, the first being inadequate con- 
trol groups. Many findings, they said, could be 
due to ‘Hawthorne effects’, referring to the fact
that people change their behaviour when they 
know that they are being observed. To discount 
this effect, Engle and colleagues recommended 
that studies should use active control groups in 
which participants perform activities that are 
identical to the training in every aspect except 
the main task. For example, although Jaeggi and 
her team tested their control group before and
after training, they did not give them additional 
exercises; they accounted for ‘test-retest’ effects 
(people do better the second time around), but 
not phenomena such as the Hawthorne effect. 

The second issue concerns the use of only 
one measure of an outcome. Participants can 
develop strategies during training that aid their 
performance on a task (for example, inner
vocalization of a visual stimulus) without any
change in their underlying ability. And, because 
no task taxes only one cognitive ability, practis- 
ing one exercise can lead to improvements on 
seemingly unrelated tasks if there is overlap 
between the abilities that they engage. To avoid 
this possibility, the consensus among cognitive 
psychologists is that researchers should use a 
range of tasks that cover, for example, numeric, 
verbal and visuospatial abilities. Most studies 
have not done this. 

A TEST OF SIGHT AND SOUND

In the dual n-back test, participants are presented with
an audio and a visual cue simultaneously. Participants must
remember both and determine

There have now been several attempts to clarify
the picture by pooling study results. In 2013,
psychologists Monica Melby-Lervåg of Oslo
University and Charles Hulme at University
College London produced a meta-analysis that
included 23 studies of adaptive working-mem- 
ory training, each lasting at least 2 weeks. They 
found a small far-transfer effect on non-verbal 
reasoning, but only in studies that used passive
control groups. In another meta-analysis
of 20 studies that only looked at n-back train- 
ing, psychologist Jacky Au of the University of 
California, Irvine, Jaeggi and their colleagues 
found a statistically significant positive effect 
on fluid intelligence — albeit small. “There is 
very good evidence that training on working- 
memory tasks does improve performance on 
tests of fluid intelligence,” says Au. But, he adds,
it is not yet clear whether this translates to a 
meaningful, real-world increase in intelligence. 

The analyses did little to settle the matter and 
the two groups have since exchanged further 
critiques. Hulme and Melby-Lervåg have
gone so far as to call for journals to stop publishing
studies that use passive controls. “No
analysis can get over having a passive control
group,” says Hulme. “The only way round it is
to have a control group that’s doing something
very similar, with the same expectations that
it’s beneficial.” But training studies are time-consuming
and expensive; doubly so when
using active controls. “Best practice is to use
active controls,” Au admits, “but our work suggests
the cost-benefit profile is not always clear,
especially given the current state of funding.” 

Meta-analysis is a powerful tool for finding 
a consensus in a confusing picture. However, 
it cannot make up for the design flaws of the 
constituent studies, says psychologist Claudia 
von Bastian of the University of Colorado, 
Boulder. And many brain-training studies have 
weaknesses. For example, the average number 
of participants per training group in Au’s meta- 
analysis was 20, meaning that most studies did 
not have enough participants for their results 
to be reliable — particularly when it comes 
to detecting small effects. This low statistical 
power not only increases the risk of finding 
spurious effects, but it also tends to inflate the 
size of any effect found. 
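
To see why 20 participants per group is too few to detect a small effect, here is a rough power calculation for a comparison between a training group and a control group. It is a sketch that uses a normal approximation to the two-sample t-test; the effect sizes (standardized differences of 0.2 and 0.8) and the 5% two-sided significance level are illustrative assumptions, not figures taken from the meta-analyses.

```python
import math
from statistics import NormalDist


def approx_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-group comparison with
    standardized effect size d and equal group sizes."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic if the effect is real.
    shift = d * math.sqrt(n_per_group / 2)
    # Probability of landing beyond either critical value.
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)


for d in (0.2, 0.8):
    print(f"effect size {d}: power with 20 per group = "
          f"{approx_power(d, 20):.2f}")
```

Under these assumptions, a small effect would be detected in only about one study in ten of this size, so the studies that do reach significance are disproportionately those that happened to overestimate the effect.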

Neither can meta-analysis adjust for stud- 
ies that use a single (and therefore inadequate) 
measure of an outcome. Using multiple meas- 
ures can enhance the validity of results, but 
some researchers advocate going further.
“There are statistical methods that allow you to
analyse change in cognitive abilities instead of
changes in test scores,” says cognitive neuroscientist
Ulman Lindenberger of the Max Planck
Institute for Human Development in Berlin. 
These ‘latent variables’ are inferred from the 
analysis of a battery of observed measures, and 
represent shared variance in those measures 
— in other words, changes in some underlying 
ability, such as reasoning or memory. Psycholo- 
gist Florian Schmiedek of the German Insti- 
tute for International Educational Research in 
Frankfurt, and his co-authors found that less 
than one-quarter of brain-training studies used 
multiple outcome measures, with only 7% using 
latent variables. Schmiedek urges researchers
to use more robust methods. “I don’t think this
case will be closed by just running more of the
kinds of studies that went into the recent meta-analyses,”
he says. “The few studies that went to
such effort paint a modest picture of the effec- 
tiveness of cognitive training.” 


BACK TO BASICS 

What is clear is that the effect of working- 
memory training on fluid intelligence lies 
somewhere between zero and very small, 
depending on whose analysis you trust. Taking 
the optimistic view, the question becomes one 
of opportunity cost: how much time and effort 
is needed to produce a lasting effect, and how 
does this compare to other uses of that time? 
Other activities purported to improve cognition 
include exercise, musical training and learning 
a new language. Unfortunately, studies of these
activities often have similar problems to brain 
training, says von Bastian, from choosing valid 
controls to difficulties in untangling causality — 
and results are similarly contested. 

Other researchers are examining the prob- 
lem from a biological perspective. Proponents 
of brain training are fond of pointing out that 
the brain remains plastic throughout life. But a
degree of stability is also essential. Lindenberger
and cognitive neuroscientist Martin Lövdén of
the Karolinska Institute’s Aging Research Center
in Stockholm argue that the brain strikes a balance
between plasticity and stability, and that
this shifts as we age. From this perspective,
changing something as fundamental as intelligence
in an adult is a big deal. “Everything from
a theoretical perspective suggests that to move
intelligence you would need to do something
massive,” says Lövdén.

Some studies have looked for evidence of 
such changes at a neural level. The problem is 
that neuroscientists do not yet fully understand 
how everyday experience affects the brain. Dur- 
ing learning, both activity and volume in parts 
of the brain initially increase, but then decrease,
and the time course is not clear, says Lövdén.
“We don't have the link yet between experience,
the brain and behavioural change,” he says. “We
need to take a step back and try to understand 
the basic science.” 

Von Bastian also advocates a return to the 
fundamentals. She is focused on developing a 
greater understanding of working memory — 
what its components are, the extent to which 
each might be malleable and how these com- 
ponents might affect other cognitive abili- 
ties. “That might be very interesting as a way 
to experimentally look at the relationship 
between working memory and intelligence,” 
she says. Research using this kind of theory- 
driven design, however, has been lacking. 

To facilitate the sharing of training tasks, 
protocols and data, von Bastian has developed 
a web-based, open-source software package, 
Tatool. The aim is to stimulate “methodologi- 
cally rigorous research” with huge sample sizes 
that is unbiased by commercial products, she 
says. So far, more than 100 researchers are 
signed up, which, she says, should bring some 
new ideas and methodology to help explore the
questions around whether brain training really
works. “If we keep doing bad studies with small
samples, we'll never know.”


Simon Makin is a freelance science writer 
based in London. 


1. Klingberg, T., Forssberg, H. & Westerberg, H. J. Clin.
Exp. Neuropsychol. 24, 781-791 (2002).

2. Jaeggi, S. M., Buschkuehl, M., Jonides, J. & Perrig,
W. J. Proc. Natl Acad. Sci. USA 105, 6829-6833 (2008).

3. Melby-Lervåg, M. & Hulme, C. Dev. Psychol. 49,
270-291 (2013).

4. Au, J., Sheehan, E., Tsai, N., Duncan, G. J.,
Buschkuehl, M. & Jaeggi, S. M. Psychon. Bull. Rev.
22, 366-377 (2015).

5. Melby-Lervåg, M. & Hulme, C. Psychon. Bull. Rev.
http://doi.org/bb9d (2015).

6. Au, J., Buschkuehl, M., Duncan, G. J. & Jaeggi, S. M.
Psychon. Bull. Rev. http://doi.org/bb9f (2015).

7. Noack, H., Lövdén, M. & Schmiedek, F. Psychol. Res.
78, 773-789 (2014).


3 MARCH 2016 | VOL 531 | NATURE | S11 


Early humans who hunted animals for meat developed bigger brains than plant eaters. 


BRAIN FOOD 


Clever eating 


Consumption of animals helped hominins to grow bigger 
brains. But in a world rich with food, how necessary is meat?


BY SUJATA GUPTA 


Around 6 million years ago, primates
began moving from tropical forests
into the savannahs. Unlike today,
these prehistoric expanses were humid 
and probably provided a year-round sup- 
ply of fruit and vegetables. But then, some 
3 million years ago, the climate changed and 
the savannahs — along with their plentiful 
food supply — dried up. 

Many mammals, including some primates, 
went extinct, but others adapted. Archae- 
ologists working at sites in modern Ethio- 
pia have discovered animal remains that 
date back almost 2.6 million years. The 
telltale cut marks on their bones are almost 
certainly signs of butchery, says Manuel
Dominguez-Rodrigo, a palaeoanthropologist 
at Complutense University in Madrid. 

Only two types of primate survived the cli- 
mate catastrophe, says Dominguez-Rodrigo. 
There was a “plant-processing machine on 
the one hand and a meat-eating machine on 
the other hand”, he says. “The meat-eating 
machine evolved a bigger brain.” 

The meat-eating machine became us.

To build and maintain a more complex brain,
our ancestors used ingredients found primarily 
in meat, including iron, zinc, vitamin B12 and 
fatty acids. Although plants contain many of 
the same nutrients, they occur in lower quan- 
tities and often in a form that humans cannot 
readily use. For instance, red meat is rich in 
iron derived from haemoglobin, which is more 
easily absorbed than the non-haem form found 
in beans and leafy greens. Furthermore, com- 
pounds known as phytates bind to the iron in 
plants and block its availability to the body. As 
a result, meat is a much richer dietary source of 
iron than any plant food (see ‘Meat efficiency’). 
“You would need to eat a massive amount of 
spinach to equal a steak,” says Christopher 
Golden, an ecologist and epidemiologist at Har- 
vard University in Cambridge, Massachusetts. 
The implications for cognitive health are 
huge. There is a clear but underappreciated
link between meat and the mind, says
Charlotte Neumann, a paediatrician at the 
University of California, Los Angeles, who 
has studied meat eating in Africa and India 
for the past three decades. Deficiencies in 
the micronutrients found in meat have been 
linked with brain-related disorders, including
low IQ, autism, depression and dementia. Iron
is crucial for the growth and branching of neu- 
rons while in the womb; zinc is found in high 
concentrations in the hippocampus, a crucial 
region for learning and memory; vitamin B12 
maintains the sheaths that protect nerves; and 
omega-3 fatty acids such as docosahexaenoic 
acid (DHA) help to keep neurons alive and to 
regulate inflammation. 


MEAT FOR THE POOR 

In the 1980s, researchers began to suspect that a 
lack of meat in some poor rural villages was con- 
tributing to a spectrum of childhood problems, 
including short stature, weakened immunity, 
social difficulties and poor school performance. 
When researchers from five universities stud- 
ied the effects of chronic malnourishment 
in Mexico, Kenya and Egypt, they found that 
children who consumed the greatest amount 
of meat and dairy products scored highest on 
physical, cognitive and behavioural tests, particularly
in Kenya. But was the absence of meat
really to blame? What the researchers needed 
was a controlled study. 

So Neumann began a trial in Kenya. Her
team selected 12 schools with children aged 
6 to 14, and gave some of the children mid- 
morning snacks. Schools were divided into four 
groups: the control group was not given a snack, 
whereas the other three received variations on 
githeri, a traditional porridge that consists of 
maize (corn), beans and greens. One group 
received a basic version, the second received the 
basic githeri with a glass of milk, and the third 
had meat added; all githeri were balanced to 
contain the same amount of calories. The study 
continued for more than 2 years and spanned 
2 cohorts, the first with 525 students and the 
second with 375. The students’ physical health 
and classroom performance were measured 
every three or six months. Compared with the 
other groups, students in the meat group had 
greater muscle mass and fewer health problems, 
and even showed greater leadership in the play- 
ground. Cognitive performance was stronger, 
too: the meat group outperformed other groups 
in maths and language subjects.

Neumann was not surprised by the results. 
The typical diet in rural Kenya is subsistence- 
based and does not include many nutrients 
that help the brain to grow. The challenge now 
is to get people to consume more meat, which is 
widely regarded as too expensive. What people 
don't realize, Neumann says, is that to nourish 
the brain, pretty much any animal matter will 
do: “Meat can be a worm, caterpillar or termite. 
It doesn't have to be butcher meat.” 


MEAT FOR THE RICH 

But how does meat fit into a richer diet? “A 
lot of the studies that have demonstrated the 
importance of meat, vitamin B, animal prod- 
ucts and protein generally have been carried 
out in populations receiving very little nutrition,”
says Diane Hosking, a healthy-ageing
researcher at the Australian National University
in Canberra.


MEAT EFFICIENCY

To reach the recommended daily intake of 18 milligrams of iron, a woman would have
to eat at least 8 times more spinach than cooked liver. Iron found in vegetables is also
harder for the body to absorb, because it is usually bound to fibre.

Food                          Amount needed for 18 mg of iron
Cooked bovine liver           300 g
Cooked beef                   625 g
Cooked lentils/chickpeas      700 g
Cooked kidney/butter beans    810 g
Cooked peas                   1.2 kg
Spinach                       2.4 kg

These data are approximate and will vary depending on factors such as preparation technique, soil or feeding conditions, and time between harvesting and intake.
Analysis by F. Mori Sarti based on data from http://ndb.nal.usda.gov and http://www.unicamp.br/
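
The arithmetic behind the ‘Meat efficiency’ figures is easy to check. The short sketch below takes the approximate amounts quoted in the sidebar and derives two things from them: how much more of each food is needed compared with liver, and the implied iron content per 100 grams. The function names are illustrative, and the calculation ignores differences in absorption, which the text says further favour meat.

```python
# Grams of each food needed to supply 18 mg of iron, as quoted in the
# 'Meat efficiency' sidebar (approximate values).
GRAMS_FOR_18_MG_IRON = {
    "cooked bovine liver": 300,
    "cooked beef": 625,
    "cooked lentils/chickpeas": 700,
    "cooked kidney/butter beans": 810,
    "cooked peas": 1200,
    "spinach": 2400,
}

DAILY_IRON_MG = 18  # recommended daily intake quoted in the sidebar


def times_more_than(reference: str) -> dict:
    """How many times more of each food is needed than of `reference`."""
    base = GRAMS_FOR_18_MG_IRON[reference]
    return {food: grams / base
            for food, grams in GRAMS_FOR_18_MG_IRON.items()}


def iron_mg_per_100g(food: str) -> float:
    """Iron content implied by the sidebar, in mg per 100 g of food."""
    return DAILY_IRON_MG / GRAMS_FOR_18_MG_IRON[food] * 100


ratios = times_more_than("cooked bovine liver")
print(ratios["spinach"])                        # 8.0
print(iron_mg_per_100g("cooked bovine liver"))  # 6.0 mg per 100 g
print(iron_mg_per_100g("spinach"))              # 0.75 mg per 100 g
```

The factor of 8 for spinach matches the sidebar's statement that a woman would need to eat at least 8 times more spinach than cooked liver.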

To fill this gap, Hosking and her team asked 
352 Australians aged between 65 and 90 years 
old — who were cognitively healthy and pre- 
dominantly from a middle- or high-income 
background — to recall what sorts of food 
they ate growing up. For instance, how often
did they eat items such as carrots, meat, fish 
or cake? The researchers then administered 
cognitive tests. 

Hosking found no correlation between the 
volunteers’ test performance and their con- 
sumption of meat as children. The results 
contradicted what Neumann and others have 
observed in developing countries. What's 
more, contrary to conventional wisdom, 
participants who consumed more fish during 
childhood and as adults were actually slower 
on measures of cognitive speed. (The fish 
might have contained neurocontaminants 
such as mercury, she says.) 

There are several issues that affect these 
results, says Hosking. One is that people don't 
eat single foods, but patterns of foods, mak- 
ing it difficult to tease out the importance of 
an individual food type, such as meat. Among the
older Australians, for instance, those who ate
meat were also more likely to consume pack- 
aged desserts and snack foods. 

Moreover, what the animal eats also 
matters. Livestock and poultry in Western 
nations are often raised in large facilities and 
fed diets that consist mainly of maize and 
soya, whereas animals from poor villages are 
typically farmed on a much smaller scale and 
forage for a greater variety of foods, which 
increases the nutrient content of their meat. 
Given these sorts of variations, Hosking says, 
“we have to be very cautious about making 
dietary recommendations ... for people who 
have access to large quantities of food.” 


MEAT IN THE BRAIN 

The micronutrients in meat have become an 
essential part of our diet over millennia. A few 
years ago, archaeologists in Tanzania unearthed 
fragments of a child’s skull dating back
1.5 million years. Deformities on the bones
suggested that the child had died from porotic
hyperostosis, a condition thought to result
from a deficiency in vitamin B12, which is
found exclusively in animal-derived foods.
Humans started eating dairy products only in
the past 5,000 years, meaning that the child
had almost certainly died from a lack of
meat⁷. So, by at least 1.5 million years ago,
says Dominguez-Rodrigo,
humans had become so adapted to eating meat 
that without it they would die. 

Research is starting to provide some clues 
as to how meat helps the brain to function. 
Bradley Peterson, director of the Institute for
the Developing Mind at Children’s Hospital Los
Angeles in California, has investigated why low
iron levels in children are correlated with
lower IQ and poor concentration⁸. Using
magnetic resonance imaging, Peterson and
his colleagues mapped out what happened in 
the brains of newborn infants of 40 adolescent 
mothers — a group known to be at high risk for 
iron deficiency. Although most of the women 
reported taking prenatal vitamins with iron, 
58% had iron levels below normal and 14% met 
the criteria for mild anaemia. 

As the brain develops, says Peterson, neurons 
become increasingly complex, forming branch- 
like dendrites covered with spines — much like 
a growing tree. The brain images that his team 
took showed a correlation between neuron 
complexity in an infant and the amount of iron 
in the mother’s diet. “The higher the iron intake 
throughout pregnancy, the more mature or the 
more complex grey matter was at the time of 
birth,” says Peterson, who is continuing to track
the mothers and babies to see how those vari- 
ations play out. 

Beyond simple measures of micronutrient 
intake, individual requirements are also influ- 
enced by a person's genetics. So far, much of 
the research has focused on how people process 
omega-3 fatty acids, chiefly DHA and eicosa- 
pentaenoic acid (EPA), which are crucial for 
human cognitive health. 

Omega-3 fatty acids are found primarily in 
oily, wild fish, such as salmon and tuna, but 
pasture-raised animals are also a good source. 
(Animals fed only soya or maize have fewer 
omega-3s.) In 2012, researchers discovered that
most African populations, but not European
populations, carried a variant of the FADS gene
that made them more efficient at converting
omega-3s in plants into a usable form, meaning
that they required less from animal sources⁹.
Conversely, a 2014 paper reported that people
carrying a variant of the APOE gene (11-17%
of US individuals of European descent) that
confers a greater risk of developing late-onset
Alzheimer’s disease derived little benefit from
eating fatty fish¹⁰. “One size does not fit all
around nutritional recommendations,” says
Hosking. Put another way, the nutrients found
in meat are important for health and cognition,
but only up to a point. “Meat packs a lot of
minerals and vitamins in just a small amount of
food,” says Dominguez-Rodrigo. “Eating meat
is like eating a power bar.”

So the key question becomes how much meat 
should a cognitive-health-conscious person 
eat. Too little can delay development and cog- 
nition. But too much, particularly if it is low 
quality and mass produced, is associated with 
other health concerns, such as heart disease 
and cancer, along with memory problems later 
in life. A person's life stage matters: pregnant 
women need more iron, as do babies and
children. Genetics also play a part, but we
don’t yet know all the particulars. All these
caveats make for a murky takeaway.


Sujata Gupta is a freelance science writer 
based in Burlington, Vermont. 


1. Dominguez-Rodrigo, M., Rayne Pickering, T., Semaw, S. & Rogers, M. J. J. Hum. Evol. 48, 109-121 (2005).
2. Neumann, C., Bwibo, N. O. & Sigman, M. Final Report Phase II: Functional Implications of Malnutrition, Kenya Project. Nutrition CRSP. (University of California, Los Angeles, 1992).
3. Neumann, C. G., Murphy, S. P., Gewa, C., Grillenberger, M. & Bwibo, N. O. J. Nutr. 137, 1119-1123 (2007).
4. Hulett, J. L. et al. Br. J. Nutr. 111, 875-886 (2014).
5. Hosking, D. E., Nettelbeck, T., Wilson, C. & Danthiir, V. Br. J. Nutr. 112, 228-237 (2014).
6. Hosking, D. & Danthiir, V. Br. J. Nutr. 110, 2069-2083 (2013).
7. Dominguez-Rodrigo, M. et al. PLoS ONE 7, e46414 (2012).
8. Monk, C. et al. Pediatric Res. http://dx.doi.org/10.1038/pr.2015.248 (2015).
9. Mathias, R. A. et al. PLoS ONE 7, e44926 (2012).
10. Chouinard-Watkins, R. & Plourde, M. Nutrients 6, 4452-4471 (2014).


3 MARCH 2016 | VOL 531 | NATURE | S13


Activities that build relationships are good for the brain. 


SOCIAL NETWORKS 


Better together 


Social ties go hand-in-hand with cognitive health. Now 
researchers are trying to determine why engaging with 
others helps to keep the brain healthy. 


BY CHELSEA WALD 


When Laura Fratiglioni looked at the data her
team had collected on ageing and cognitive
decline, she noticed something odd: after the
age of 80,
women in Stockholm were more likely than 
men to get dementia. Other research revealed 
the same trend outside of Sweden, too. Could 
it be that women’s brains were more prone 
to dementia than men’s, or was there some 
other aspect of their post-80 lifestyles caus- 
ing the decline? 

When they delved into the data, Fratiglioni 
and her colleagues found an irony: that 
women’s well-known longevity was also 
their undoing. “Women are more frequently 
alone when they reach the age of 80 or 85 
because women are more often married to 
older men — and in general, men die earlier 
than women,” says Fratiglioni, who is a
neurologist and director of the Aging Research
Center of the Karolinska Institute in
Stockholm. And, as Fratiglioni’s team concluded
in its seminal paper¹, it is this isolation that
is correlated with increased risk of dementia.

Since this work in the 1990s, the link has
grown clearer. Several longitudinal studies, as 
well as brain studies, have shown that social 
ties are associated with better cognition. It is 
as though our friends and family can make 
us brighter, perhaps by stimulating think- 
ing and reducing stress. If that is true, then 
working on our social networks when we are 
young could pay off later, by delaying both 
the normal decline that accompanies ageing 
and the pathological deterioration associ- 
ated with dementia. Researchers are trying 
to work out which aspects of our networks 
do the most good, and what mechanisms link 
our social lives and our biology. 


RICH RELATIONSHIPS 
Over the past two decades, scientists have 
found associations between richer social ties 
— multiple links with friends, family and the 
community — and a variety of positive health 
outcomes, from reduced susceptibility to the 
common cold to greater life expectancy. 
Initially, Fratiglioni and colleagues were 
uncertain whether their findings were real or 
whether they could be explained by reverse
causation. That is, because participants were
followed for only three years on average, the
team may have missed subtle, early signs of
dementia that were already affecting
relationships at the outset, meaning that when
dementia became apparent it would seem to
have been caused by degraded networks. But,
in a later analysis², volunteers were followed
for longer. Those who developed the disease 
did so about six years after monitoring began, 
making it less likely that undetected symp- 
toms were biasing the results. The authors 
reached a similar conclusion: that the social 
component of any leisure activity protected 
against dementia as much as the mental and 
physical components did. 

Other groups have also found support for 
a causal link. A six-year study in the United 
States by a team at the Harvard School of 
Public Health in Boston, Massachusetts, 
found that social integration — assessed by 
marital status, volunteer activity and fre- 
quency of contact with family and neigh- 
bours — helped to delay memory loss in 
older people³. The team found that memory
among the least integrated declined twice as 
fast as it did for the most integrated. And they 
found no evidence of reverse causation. 

The link could also apply to younger peo- 
ple. An analysis of survey data collected from 
35- to 85-year-old Americans found that, 
at any age, people with more contacts and 
social support performed better on tests of 
executive function and memory⁴. Although
this conclusion is debated, the researchers 
suggest that any relationship is likely to be 
reciprocal — cognitive function affects social 
engagement and vice versa. 


DIGGING DEEP 
Because studies use a variety of measures for 
both social network and cognitive function, the 
reason for the link has been hard to pin down. 
Almost by definition, people with strong social 
networks tend to have more access to informa- 
tion, resources, and assistance and advice from 
other people than do those who are isolated. But 
social support — even if it is well meaning — 
can also have negative effects on well-being, for 
example when it is perceived to be intrusive or 
controlling. “The degree and quality with which 
we have our social interactions impacts our 
entire brain and body,” says cognitive
psychologist Timothy Verstynen at Carnegie Mellon
University in Pittsburgh, Pennsylvania. 
Perhaps that shouldn't be surprising. Social 
networks existed long before sites such as
Facebook, and it seems that they have been an 
essential part of being human since our brains 
began tripling in size some 2 million years 
ago. The social-brain hypothesis, proposed by 
anthropologist Robin Dunbar of the University 
of Oxford, UK, states that primates’ dispropor- 
tionately large brains evolved to handle the 
complex demands of social living, with human 
brains being the most disproportionate of all. 




SOURCE: W.-X. ZHOU ET AL. PROC. BIOL. SCI. 272, 439-444 (2005).


The link between social network, cognition 
and brain size occurs not only at the species 
level, but also at the individual level. Dunbar, 
working with psychologist James Stiller at 
Nottingham Trent University, UK, found a 
correlation between the size of a person's net- 
work and their performance on tests of both 
memory and ‘theory of mind’ — the ability to 
understand another person’s thoughts⁵.
Dunbar, with US and UK colleagues, also found
that the grey-matter volume of parts of the
prefrontal cortex varies with social-network
size, as well as with performance on
theory-of-mind tasks⁶. The prefrontal cortex is
essential for
social cognition: it handles information pro- 
cessing, planning, working memory, language 
and attention. Further research by Dunbar and 
others revealed a standardization to human 
social groups, from an intimate support clique 
of 3-5 individuals to a broad active network of 
150 people (see ‘Social animals’). 
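The regular spacing of these layers can be made concrete with a small calculation. The sketch below is only an illustration: it takes the upper bound of each group size quoted in the text and in the ‘Social animals’ figure (the layer names and the choice of upper bounds are assumptions made for this example) and computes how much larger each layer is than the one nested inside it.

```python
# Upper bounds of the nested social layers named in the article and its
# 'Social animals' figure (number of people in each layer).
layers = [
    ("support clique", 5),
    ("close relationships", 15),
    ("good friends", 50),
    ("friends (Dunbar's number)", 150),
    ("near acquaintances", 500),
    ("recognizable people", 1500),
]

# Ratio of each layer to the layer nested inside it.
ratios = [outer / inner for (_, inner), (_, outer) in zip(layers, layers[1:])]
mean_ratio = sum(ratios) / len(ratios)

for (name, size), ratio in zip(layers[1:], ratios):
    print(f"{name}: {size} people, {ratio:.2f} times the previous layer")
print(f"mean scaling ratio: {mean_ratio:.2f}")
```

Each step comes out at between 3 and about 3.3, in line with the figure's statement that group sizes vary by a factor of about three.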

A similar link has been found for other 
brain regions. The volume of the amygdala, 
the almond-shaped emotion centre deep in the 
brain, correlates with the size and complexity of 
a person's social network. And grey-matter den- 
sity in certain parts of the temporal lobe, which 
is associated with social perception and associa- 
tive memory, has been found to vary according 
to the size of volunteers’ Facebook networks⁷.
Some researchers, including Fratiglioni, suspect 
that the cognitively demanding act of socializing 
can actually build up the brain — like exercising 
builds up muscles. This ‘brain reserve’ may then 
act as a buffer against functional loss, even in the 
face of conditions such as Alzheimer’s disease. 

Whether this social stimulation can form 
the basis of a medical intervention is under 
investigation. Last year, researchers led by ger- 
ontologist and biostatistician Hiroko Dodge of 
Oregon Health & Science University in Portland 
hooked up internet-based face-to-face commu- 
nications systems in volunteers’ homes. Using 
the system, participants, whose average age was 
just over 80, spoke with trained interviewers for 
30 minutes a day for 6 weeks. After the interven- 
tion, volunteers performed better in language- 
based tests of executive function, for example, 
when they were asked to name as many words 
as possible that belonged to a certain category⁸.
“Surprisingly, we found a big effect,” Dodge says.

Dodge hopes to run a larger and longer 
study, with the goal of showing that such 
short conversations might delay the onset of 
dementia. She is optimistic about uptake,
should this simple system prove successful.
Unlike physical and cognitive exercises that 
require special effort, she says, “talking with 
people is so natural”. 


SOCIAL ANIMALS

In primates, the size of social circles correlates with the relative size of the neocortex compared with the rest of the
brain (1). In humans, Dunbar’s number is a suggested limit on the number of relationships a person can maintain.
Human social networks seem to be split into hierarchies (2, 3), with group sizes varying by a factor of about three.

500 near acquaintances


INFLAMED THINKING

There is, however, more to the brain than
the number of little grey cells. White matter
functions like “wires that connect brain areas
together”, supporting many aspects of
cognition, says Verstynen. White-matter
integrity is known to correlate with cognitive
performance, and, in 2014, Verstynen’s team
revealed a link between white-matter integrity
and the richness of a person’s web of social
interactions⁹. “Social-network diversity is
affecting the efficiency of those wires,” says
Verstynen.

To explain the link between social networks
and white matter, Verstynen turns to
inflammation. Isolated individuals have higher
levels of inflammation than those who live in a
social milieu; this inflammation is comparable
with that of those who smoke or who are obese.
“The greater your inflammation in the body,
the more you inflame myelin in the brain,”
Verstynen says, referring to the sheaths that
protect nerves, “and sometimes that leads to
degradation of the myelin.”

If this is true, Verstynen says, then isolation 
could create a feedback loop: a reduction in 
social network diversity could raise levels of 
inflammation, damaging the white matter. 
That could lead to poor decision making, 
which in turn could lead to further shrink- 
age of the social network. In other words, los- 
ing friends could alter brain biology in a way 
that leads to losing even more friends. This 
might sound bleak, but “there is hope,” says
Verstynen. He suggests that interventions that 
increase someone's social network diversity or 
reduce systemic inflammation “might be able 
to break this circuit and improve brain health”. 

Online networks may help to decrease
isolation, particularly for those with limited
mobility. In a study of older people living
alone, medical sociologist Shelia Cotten of
Michigan State University found that using the
Internet helped to reduce feelings of
loneliness and increase connections with
friends and family¹⁰. Social ties, including
online ones, have the potential to “provide
emotional support and also provide
informational support that help people make
important life decisions”, she says.

‘Social animals’ group sizes:
12-15 close relationships
45-50 good friends
150 is the maximum number of friends (Dunbar’s number)
1,500 people can be recognized
5,300 Plato’s ideal size for a democracy

Sites such as Facebook could be as important
for a healthy lifestyle as the gym, so long
as they help to build and maintain a strong
network. “I tell people, it’s very important, 
especially when we are getting older, to 
continue to be active in all three compo- 
nents — that is, mental, physical and social,” 
Fratiglioni says. Five years ago, she decided to 
follow her own advice and took up tango. The 
dance style is “unbelievable”, she says: “very
social, very emotional, very physical”. In other
words, it’s the perfect medicine.


Chelsea Wald is a freelance science writer 
based in Vienna.


Abstractions of the mind


Before data were so abundant, computer models of the brain were simple. Information is now
much more plentiful, but some argue that models should remain uncomplicated.


BY KELLY RAE CHI 


The first major results of the Blue Brain
Project, a detailed simulation of a bit of
rat neocortex about the size of a grain
of coarse sand, were published last year¹. The
model represents 31,000 brain cells and
37 million synapses. It runs on a supercomputer
and is based on data collected over 20 years.
Furthermore, it behaves just like a speck of
brain tissue. But therein, say critics, lies the
problem.

“It’s the best biophysical model we have of any
brain, but that’s not enough,” says Christof
Koch, a neuroscientist at the Allen Institute for
Brain Science in Seattle, Washington, which
has embarked on its own large-scale
brain-modelling effort. The trouble with the
model is that it holds no surprises: no higher
functions or unexpected features have emerged
from it. Some neuroscientists, including Koch,
say that this is because the model was not built
with a particular hypothesis about cognitive
processes in mind. Its success will depend on
whether specific questions can be asked of
it. The irony, says neuroscientist Alexandre
Pouget, is that deriving answers will require
drastic simplification of the model, “unless we
figure out how to adjust the billions of
parameters of the simulations, which would seem
to be a challenging problem to say the least”.
By contrast, Pouget’s group at the University
of Geneva, Switzerland, is generating and
testing hypotheses on how the brain deals with
uncertainty in functions such as attention and
decision-making.

There is a widespread preference for 
hypothesis-driven approaches in the brain- 
modelling community. Some models might be 
very small and detailed, for example, focusing 
on a single synapse. Others might explore the 
electrical spiking of whole neurons, the com- 
munication patterns between brain areas, or 
even attempt to recapitulate the whole brain. 
But ultimately a model needs to answer ques- 
tions about brain function if we are to advance 
our understanding of cognition. 


FROM TOP TO BOTTOM 
Blue Brain is not the only sophisticated model 
to have hit the headlines in recent years. In 
late 2012, theoretical neuroscientist Chris 
Eliasmith at the University of Waterloo in 
Canada unveiled Spaun, a whole-brain model 
that contains 2.5 million neurons (a fraction 
of the human brain’s estimated 86 billion). 
Spaun has a digital eye and a robotic arm, and
can reason through eight complex tasks such
as memorizing and reciting lists, all of which
involve multiple areas of the brain².
Nevertheless, Henry Markram, a neurobiologist
at the Swiss Federal Institute of Technology in
Lausanne who is leading the Blue Brain Project,
noted³ at the time: “It is not a brain model.”
Although Markram’s dismissal of Spaun
amused Eliasmith, it did not surprise him.
Markram is well known for taking a different 
approach to modelling, as he did in the Blue 
Brain Project. His strategy is to build in every 
possible detail to derive a perfect imitation of 
the biological processes in the brain with the 
hope that higher functions will emerge — a 
‘bottom-up’ approach. Researchers such as
Eliasmith and Pouget take a ‘top-down’
strategy, creating simpler models based on our
knowledge of behaviour. These skate over 
certain details, instead focusing on testing 
hypotheses about brain function. 

Rather than dismiss the criticism, Eliasmith 
took Markram’s comment on board and added 
bottom-up detail to Spaun. He selected a 
handful of frontal cortex neurons, which were 
relatively simple to begin with, and swapped 
them for much more complicated neurons — 
ones that account for multiple ion channels 
and changes in electrical activity over time. 
Although these complicated neurons were 
more biologically realistic, Eliasmith found 
that they brought no improvement to Spaun’s 
performance on the original eight tasks. “A 
good model doesn’t introduce complexity for 
complexity’s sake,” he says. 


SIMPLIFY, SIMPLIFY, SIMPLIFY 

For many years, computational models of the 
brain were what theorists call unconstrained: 
there were not enough experimental data to
map onto the models or to fully test them.
For instance, scientists could record electrical 
activity, but from only one neuron at a time, 
which limited their ability to represent neural 
networks. Back then, brain models were simple 
out of necessity. 

In the past decade, an array of technologies 
has provided more information. Imaging tech- 
nology has revealed previously hidden parts of 
the brain. Researchers can control genes to iso- 
late particular functions. And emerging statisti- 
cal methods have helped to describe complex 
phenomena in simpler terms. These techniques 
are feeding newer generations of models. 

Nevertheless, most theorists think that a 
good model includes only the details needed 
to help answer a specific question. Indeed, one
of the most challenging aspects of model
building is working out which details are
important to include and which are acceptable
to ignore. “The simpler the
model is, the easier it is to analyse and under- 
stand, manipulate and test,” says cognitive and 
computational neuroscientist Anil Seth of the 
University of Sussex in Brighton, UK.

An oft-cited success in theoretical 
neuroscience is the Reichardt detector — a 
simple, top-down model for how the brain 
senses motion — proposed by German physi- 
cist Werner Reichardt in the 1950s. “The big 
advantage of the Reichardt model for motion 
detection was that it was an algorithm to begin 
with,” says neurobiologist Alexander Borst of
the Max Planck Institute of Neurobiology in 
Martinsried, Germany. “It doesn’t speak about 
neurons at all.” 
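Because the Reichardt model is an algorithm rather than a description of neurons, it can be written down in a few lines. The sketch below is a minimal correlation-type detector, not Reichardt's or Borst's own code: the signal from each of two neighbouring receptors is delayed and multiplied by the undelayed signal from the other, and the two products are subtracted. The function names, the sinusoidal test stimulus and the choice of delay are assumptions made for illustration.

```python
import math


def reichardt_response(left, right, delay):
    """Mean output of a correlation-type motion detector.

    left, right: equally sampled signals from two neighbouring receptors.
    delay: number of samples by which each arm's delayed copy lags.
    """
    total = 0.0
    n = len(left)
    for t in range(delay, n):
        preferred = left[t - delay] * right[t]  # delayed left times current right
        null = right[t - delay] * left[t]       # mirror-image arm
        total += preferred - null
    return total / (n - delay)


def moving_grating(n_samples, period, lag):
    """Sinusoid seen by two receptors; the right one sees it `lag` samples later."""
    left = [math.sin(2 * math.pi * t / period) for t in range(n_samples)]
    right = [math.sin(2 * math.pi * (t - lag) / period) for t in range(n_samples)]
    return left, right


left, right = moving_grating(n_samples=400, period=40, lag=5)
rightward = reichardt_response(left, right, delay=5)  # pattern moves left to right
leftward = reichardt_response(right, left, delay=5)   # same pattern, reversed

print(f"rightward motion: {rightward:+.3f}")
print(f"leftward motion:  {leftward:+.3f}")
```

The sign of the output reports the direction of motion: swapping the two inputs, which is equivalent to reversing the stimulus, flips the sign of the response.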

When Borst joined the Max Planck Society 
in the mid-1980s, he ran computational
simulations of the Reichardt model, and got
surprising results. He found, for instance, that
neurons oscillated when first presented with a
pattern that was moving at constant velocity, a
result that he took to Werner Reichardt, who was
also taken aback. “He didn’t expect his model
to show that,” says Borst. They confirmed the
results in real neurons, and continued to refine 
and expand Reichardt’s model to gain insight 
into how the visual system detects motion. 

In the realm of bottom-up models, the 
greatest success has come from a set of
equations developed in 1952 to explain how the
flow of ions in and out of a nerve cell produces
an action potential. These Hodgkin-Huxley
equations are “beautiful and inspirational”, says
neurobiologist Anthony Zador of Cold Spring 
Harbor Laboratory in New York, adding that 
they have allowed many scientists to make 
predictions about how neuronal excitability 
works. The equations, or their variants, form 
some of the basic building blocks of many of 
today’s larger brain models of cognition. 
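To give a sense of what such a building block looks like, the sketch below integrates the Hodgkin-Huxley equations for a single patch of membrane. It is an illustration under stated assumptions, not code from any of the projects described here: the parameter values are the commonly quoted textbook ones for the squid giant axon, the integration is a simple forward-Euler scheme, and the function names and injected current are chosen for this example.

```python
import math

# Commonly quoted textbook parameters for the squid giant axon (assumed here).
C_M = 1.0                               # membrane capacitance, uF/cm^2
G_NA, G_K, G_L = 120.0, 36.0, 0.3       # maximal conductances, mS/cm^2
E_NA, E_K, E_L = 50.0, -77.0, -54.387   # reversal potentials, mV


def rates(v):
    """Opening (alpha) and closing (beta) rates of the m, h and n gates at v mV."""
    a_m = 0.1 * (v + 40.0) / (1.0 - math.exp(-(v + 40.0) / 10.0))
    b_m = 4.0 * math.exp(-(v + 65.0) / 18.0)
    a_h = 0.07 * math.exp(-(v + 65.0) / 20.0)
    b_h = 1.0 / (1.0 + math.exp(-(v + 35.0) / 10.0))
    a_n = 0.01 * (v + 55.0) / (1.0 - math.exp(-(v + 55.0) / 10.0))
    b_n = 0.125 * math.exp(-(v + 65.0) / 80.0)
    return (a_m, b_m), (a_h, b_h), (a_n, b_n)


def simulate(current, duration_ms=50.0, dt=0.01):
    """Forward-Euler integration; returns the membrane potential trace in mV."""
    v = -65.0
    # Start each gate at its steady-state value for the resting potential.
    m, h, n = (a / (a + b) for a, b in rates(v))
    trace = []
    for _ in range(int(duration_ms / dt)):
        (a_m, b_m), (a_h, b_h), (a_n, b_n) = rates(v)
        i_na = G_NA * m**3 * h * (v - E_NA)   # sodium current
        i_k = G_K * n**4 * (v - E_K)          # potassium current
        i_l = G_L * (v - E_L)                 # leak current
        v += dt * (current - i_na - i_k - i_l) / C_M
        m += dt * (a_m * (1.0 - m) - b_m * m)
        h += dt * (a_h * (1.0 - h) - b_h * h)
        n += dt * (a_n * (1.0 - n) - b_n * n)
        trace.append(v)
    return trace


resting = simulate(current=0.0)    # no injected current
driven = simulate(current=10.0)    # steady injected current, uA/cm^2

print(f"peak potential at rest:     {max(resting):.1f} mV")
print(f"peak potential when driven: {max(driven):.1f} mV")
```

With no injected current the membrane sits near its resting potential; with a steady current the same equations produce action potentials that overshoot zero millivolts.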


GAMBLE IN DETAILS 

Although many theoretical neuroscientists 
do not see value in pure bottom-up 
approaches such as that taken by the Blue 
Brain Project, they do not dismiss bottom-up 
models entirely. These types of data-driven 
brain simulations have the benefit of remind- 
ing model-builders what they do not know, 
which can inspire new experiments. And 
top-down approaches can often benefit from 
the addition of more detail, says theoretical 
neuroscientist Peter Dayan of the Gatsby 
Computational Neuroscience Unit at Uni- 
versity College London. “The best kind of 
modelling is going top-down and bottom-up 
simultaneously,” he says.

Borst, for example, is now approaching the 
Reichardt detector from the bottom up to 
explore questions such as how neurotrans- 
mitter receptors on motion-sensitive neurons 
interact. And Eliasmith’s more complex Spaun 
has allowed him to do other types of experi- 
ment that he couldn't before — in particular, he 
can now mimic the effect of sodium-channel 
blockers on the brain. 

Also taking a multiscale approach is 
neuroscientist Xiao-Jing Wang of New 
York University Shanghai in China, whose 
group described a large-scale model of the 
interaction of circuits across different regions 
of the macaque brain⁴. The model is built, in
part, from his previous, smaller models of local 
neuronal circuits that show how neurons in a 
group fire in time. To scale up to the entire 
brain, Wang had to include the strength of 
the feedback between areas. Only now has he 
got the right data — thanks to the burgeoning 
field of connectomics (the study of connection 
maps within an organism’s nervous system) — 
to build in this important detail, he says. Wang 
is using his model to study decision-making, 
the integration of sensory information and 
other cognitive processes.

In physics, the marriage between experiment
and theory led to the development of unifying 
principles. And although neuroscientists might 
hope for a similar revelation in their field, the 
brain (and biology in general) is inherently 
more noisy than a physical system, says com- 
putational neuroscientist Gustavo Deco of the 
Pompeu Fabra University in Barcelona, Spain, 
who is an investigator on the Human Brain
Project. Deco points out that equations describing
the behaviour of neurons and synapses are non- 
linear, and neurons are connected in a variety 
of ways, interacting in both a feedforward and a 
feedback manner. That said, there are examples 
of theory allowing neuroscientists to extract 
general principles, such as how the brain bal- 
ances excitation and inhibition, and how neu- 
rons fire in synchrony, Wang says. 

Complex neuroscience often requires 
huge computational resources. But it is not 
a want of supercomputers that limits good, 
theory-driven models. “It is a lack of knowl- 
edge about experimental facts. We need more 
facts and maybe more ideas,” Borst says. 
Those who crave vast amounts of computer 
power misunderstand the real challenge 
facing scientists who are trying to unravel 
the mysteries of the brain, Borst contends. 
“I still don’t see the need for simulating one
million neurons simultaneously in order to
understand what the brain is doing,” he says,
referring to the large-scale simulation linked
with the Human Brain Project. “I’m sure we
can reduce that to a handful of neurons and
get some ideas.” 

Computational neuroscientist Andreas 
Herz, of the Ludwig-Maximilians University in 
Munich, Germany, agrees. “We make best pro- 
gress if we focus on specific elements of neural 
computation,” he says. For example, a single 
cortical neuron receives input from thousands 
of other cells, but it is unclear how it processes 
this information. “Without this knowledge, 
attempts to simulate the whole brain in a seem- 
ingly biologically realistic manner are doomed 
to fail,” he adds.

At the same time, supercomputers do allow 
researchers to build details into their models 
and see how they compare to the originals, as 
with Spaun. Eliasmith has used Spaun and its 
variations to see what happens when he kills 
neurons or tweaks other features to investigate 
ageing, motor control or stroke damage in the 
brain. For him, adding complexity to a model 
has to serve a purpose. “We need to build big- 
ger and bigger models in every direction, more 
neurons and more detail,” he says. “So that we 
can break them.”


Kelly Rae Chi is a freelance science writer 
based in Cary, North Carolina. 


1. Markram, H. et al. Cell 163, 456-492 (2015).
2. Eliasmith, C. et al. Science 338, 1202-1205 (2012).
3. Sanders, L. Science News http://go.nature.com/j1t8lu (2012).
4. Chaudhuri, R., Knoblauch, K., Gariel, M.-A., Kennedy, H. & Wang, X.-J. Neuron 88, 419-431 (2015).




NEUROBIOLOGY 




Rise of resilience 


Stress can have a negative influence on the human brain, but increasingly it is the ability to 
withstand severe stress that is the focus of research. 


BY ANTHONY KING 


Daniela Kaufer has a personal interest in
the effects of stress. “My mum’s family
had a very traumatic experience when
their mother died in childbirth,” she explains.
The three children grew up motherless, in 
1950s war-torn Israel, but there was a marked 
difference in how well the siblings coped. “My 
mum had an extremely difficult early life,” she 
says. “Yet she is extremely resilient.” Kaufer, 
who is a neuroscientist at the University of 
California, Berkeley, says that why her mother 
in particular coped so well has fascinated her. 

Research into how people react to early 
trauma began in earnest after the Second 
World War. Distressing events such as the 
death of a parent have been found to increase 
children’s short-term risk of major depression, 
anxiety disorders and post-traumatic stress 
disorder (PTSD). With advances in tech- 
niques to study genes and to explore the brain, 
the neurobiological study of stress is under- 
going a revolution — and our view of the stress 
response is changing. Until about 20 years ago, 
the absence of a severe negative reaction such 
as PTSD was thought to be a lack of response. 
Instead, “resilience is now viewed as a reactive 
response”, says Kaufer.

What resilience means in terms of gene 
expression, numbers of cells and brain net- 
works is now the focus of research. “Years 
ago, most would have thought that resilient 
individuals escape some of the bad things that 
stress induces in the brains of more susceptible
individuals,” explains neurobiologist Eric 
Nestler at Mount Sinai School of Medicine, 
New York. “Now we believe susceptible indi- 
viduals lack some of the more adaptive changes 
that occur in the resilient brain.”

Stress is an unavoidable component of life, 
and the stress response is a crucial survival 
mechanism. “The brain is a detector of threatening
information,” says neuropsychologist Sonia
Lupien at the University of Montreal, Canada.
“This is its most important job if you want to
survive.” Acute stress readies us for action, but
chronic stress wears us down, altering the brain 
genetically and neurologically and priming us 
for mental health problems. Whether a person 
is resilient to stress depends heavily on their life 
history. Understanding the effects of early-life 
difficulties could provide new ways to treat or 
prevent mental illnesses such as severe depres- 
sion or PTSD in susceptible individuals. 


STRESS IN THE BRAIN 

When we are confronted with a life-threatening
situation, hormones and neurotransmitters prep us for
action. Specific stress hormones — cortisol 
in primates, corticosterone in most rodents 
— are released, some of which surge across 
the blood-brain barrier. Stress gets every- 
where: all our cells host receptors for the 
hormone. “Every brain area has something 
happen to it,” says Kaufer. The human brain 
has two types of receptor for cortisol. One 
has a six to tenfold higher affinity for the 
molecule than the other, and so is activated
earlier, by smaller amounts of cortisol. 

The hippocampus (which is pivotal for 
memory) and the amygdala (the centre for emo- 
tions) contain lots of the high-affinity receptors, 
and are, therefore, activated by slight rises in the 
hormone. The frontal lobe, which is involved 
in executive planning and control, has only the 
low-affinity receptor, and is activated later, after 
the tide has risen. And, as Lupien and colleagues 
found, both memory formation and recall in 
adults can be influenced by cortisol¹.

The existence of two receptor types means 
that response to stress is not linear. “The rela- 
tionship between circulating stress hormone 
and memory is an inverted U-shape function,” 
Lupien explains. “Up to a certain level, stress 
hormones are good for your memory” — when 
the cortisol binds only to the high-affinity 
receptors, the ability to lay down and retrieve 
memory is enhanced. When the low-affinity 
receptors are activated, the relationship enters 
the right-hand side of the U-shape and the 
response shifts, she adds. 
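
One way to see how two receptor types could produce an inverted U is a toy occupancy model. The sketch below is an illustration only and is not taken from Lupien's work: it assumes simple saturating binding, an eightfold affinity difference picked from within the six- to tenfold range quoted above, and a hypothetical "memory effect" defined as high-affinity occupancy minus low-affinity occupancy.

```python
import math

K_HIGH = 1.0           # assumed half-saturation level, high-affinity receptor (arbitrary units)
K_LOW = 8.0 * K_HIGH   # assumed eightfold lower affinity for the second receptor


def occupancy(cortisol, k):
    """Fraction of receptors bound, under simple saturating binding."""
    return cortisol / (cortisol + k)


def memory_effect(cortisol):
    """Hypothetical readout: high-affinity benefit minus low-affinity cost."""
    return occupancy(cortisol, K_HIGH) - occupancy(cortisol, K_LOW)


# Scan cortisol levels from 0.1 to 100.0 and find where the toy effect peaks.
levels = [0.1 * i for i in range(1, 1001)]
peak_level = max(levels, key=memory_effect)

for c in (0.1, 1.0, peak_level, 10.0, 100.0):
    print(f"cortisol={c:6.1f}  effect={memory_effect(c):.3f}")
```

In this toy model the effect rises while mainly the high-affinity receptor is filling, peaks near the geometric mean of the two assumed binding constants, and falls again as the low-affinity receptor saturates. The real relationship between hormone level and memory will depend on biology that this sketch ignores.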

The duration of stress is also important. A 
transient bout of stress causes a proliferation 
of neural stem cells and a spike in numbers of 
new neurons, which take at least two weeks to 
mature. The brain seems to be preparing itself 
in case a second stressor comes calling. Chronic 
stress is not so beneficial. It slashes investment 
in new neurons, prunes the tree-like shape of 
existing ones, and suppresses new connections. 

If stress hormones remain elevated for 
months or years, they can stimulate physiological
changes: the hippocampus shrinks and
the amygdala grows, for example. Eventually, 
the complex feedback system that suppresses 
the excess secretion of cortisol is disturbed. 
Once this happens, the capacity to discrimi- 
nate between threat levels falls away. Either 
everything seems threatening (anxiety) or else 
nothing does (depression or burnout). 


EARLY INFLUENCES 

Old ideas that certain individuals have an 
inherent ‘hardiness’ or an innate ability to 
bounce back from severe stress have fallen 
by the wayside. Instead, resilience and our 
response to trauma are recognized as being 
more dynamic, changing throughout life. It’s a 
complicated milieu, but one of the main ways 
that stress marks the brain is through epige- 
netics. This does not change genes, but it can 
change their expression by attaching methyl 
groups to DNA or associated proteins. 

At McGill University in Montreal, neuro- 
scientist Michael Meaney’s group has been 
exploring the stress response in rats. It found 
that high-licking and low-licking grooming 
strategies in mother rats gave rise to different
offspring². The lesser-grooming mums
produced pups with high anxiety, poor stress
recovery and low cognitive performance. The
pups’ brain circuits that switch off stress were
sluggish, owing to higher DNA methylation 
and lower expression of the ‘off’ receptors in 
the hippocampus. Well-groomed pups showed 
the opposite. 

People might be tempted to label high- 
groomers as better mothers, says Kaufer. “But 
it is not ‘good mums’ or ‘bad mums’, just a
different parenting style.” Parenting style can 
reflect the environment and prepare the offspring,
she explains. Being a cautious, worried
rat (the offspring of a less-thorough groomer)
makes sense if you live in an alley full of cats.

“The stress response is one of the most 
conserved things in evolution,” Kaufer says. 
This means that animal findings tend to be 
applicable to humans. However, what is good 
for avoiding predators may not be a healthy 
adaptation to the continued stress of modern 
life. Kieran O’Donnell, a neuroscientist in 
Meaney’s lab, says that the epigenetic changes 
in the anxious rat brains seem to have human 
analogies. “We see the same sorts of changes in 
DNA methylation of the hormone receptor in 
people who suffered child maltreatment,” he 
says. “The question is, can we intervene and do 
anything about this?” O'Donnell is currently 
investigating whether the children of vulner- 
able young mums who received regular nurse 
visits after they had given birth show changes 
in DNA methylation 27 years later. 

Chronic stress in early life stamps an espe- 
cially long-lasting mark on the brain. Some 
people carry an epigenetic signature of stress 
from exposure as a baby — or even as an 
embryo. Emotional stress in pregnant women,
for example, can alter their children’s epigenetics
and the neural connections in their babies’
amygdalas. Later exposure to severe trauma can
activate this signature, explains neurobiologist
Alon Chen, of the Weizmann Institute of Science
in Rehovot, Israel. He was involved in a
study that reported that chronic stress can reduce
the number of hormone receptors in the
hippocampus needed to shut off the stress response³.
This means that individuals exposed to chronic
stress early in life may be
more susceptible to a stress-related disorder as
an adult, owing to a disrupted feedback loop.
“It is not healthy to stay in a stress situation for
a long time,” says Chen. “Returning to a normal
level is an important part of the stress-response
machinery.”

A white ‘bully mouse’ intimidates a smaller mouse, which then develops signs of social withdrawal.


BEATING THE BULLIES 

Stress affects our relationships with others 
(and socializing is itself an agent in cognitive 
health; see page S14). Nestler has devised a 
‘bully mouse’ scenario (pictured) in which, 
for 5 to 10 minutes a day over 10 days, a nor- 
mal mouse is placed in a cage that is already 
occupied by a larger, more aggressive strain 
of mouse that intimidates the incomer. At all 
other times, the mice are kept close enough 
to see and smell one another, but they are 
separated by mesh. Nestler’s team found 
that afterwards, some of the bullied mice 
avoid all social contact, even with smaller, 
non-aggressive mice. 

As with memory, the way that sociabil- 
ity changes with stress is not linear. Kaufer’s 
lab found that rats exposed to moderate 
stress — in this case, being immobilized in a 
bag — displayed more positive social behav- 
iour, such as huddling, resource sharing and 
reduced aggression⁴. The researchers also saw
an increase in the prosocial hormone oxytocin. 
But if the immobilized rats were exposed to 
fox odour, the addition of this high-level-stress 
inducer caused them to lose all prosocial
behaviours. Oxytocin plummeted, as did its 
receptors. “This is really interesting because it
can start to explain the social withdrawal that 
you can see in some psychopathologies like 
PTSD and depression,” Kaufer says. 

Yet even in these scenarios, some rodents did 
better than others. Nestler and his colleagues 
found that some of their mice were resilient 
to the bullying, and that these mice showed a 
greater number of gene-expression changes⁵. So
resilience was a reaction, endowing the individ- 
uals with greater adaptability. “We don’t know 
the underlying cause,” says Nestler, adding that
the search is on for genes with altered expression 
in the brains of resilient individuals. His lab is 
now planning to look at the role of individual 
messenger RNAs and proteins in mediating 
resilience to try and tease out what is different 
in the cells and neural circuits of resistant mice. 

This research could deliver benefits for treat- 
ing stress-related disorders such as depression. 
Most antidepressant drug-discovery efforts 
have focused on ways to undo the bad effects 
of stress. “Understanding resilience offers an 
additional approach,” says Nestler, “to look for 
ways to induce mechanisms of natural resil- 
ience in those individuals who are inherently 
more susceptible.” 

Research into stress is changing how we 
view mental-health conditions. Epigenetic 
and brain-chemistry changes caused by life 
stresses can be reversed by activities such 
as exercise, yoga, meditation and mental 
stimulation. And soon these types of behav- 
ioural intervention might be complemented 
by pharmaceuticals. “We may learn the 
molecular mechanisms important for resilience,”
says O’Donnell. “And then use that to
help those susceptible to stress or who have 
suffered ill treatment or trauma.”


Anthony King is a freelance science writer 
based in Dublin. 
Progress and challenges in probing the human brain


Russell A. Poldrack¹ & Martha J. Farah²


Perhaps one of the greatest scientific challenges is to understand the human brain. Here we review current methods in 
human neuroscience, highlighting the ways that they have been used to study the neural bases of the human mind. We 
begin with a consideration of different levels of description relevant to human neuroscience, from molecules to 
large-scale networks, and then review the methods that probe these levels and the ability of these methods to test 
hypotheses about causal mechanisms. Functional MRI is considered in particular detail, as it has been responsible for 
much of the recent growth of human neuroscience research. We briefly review its inferential strengths and weaknesses 
and present examples of new analytic approaches that allow inferences beyond simple localization of psychological 
processes. Finally, we review the prospects for real-world applications and new scientific challenges for human neuroscience.


The way that we conceptualize brain function has always been
constrained by the methods available to study it. Studies of
patients with focal brain lesions in the nineteenth century led
to the view of the brain as a collection of focal centres specialized for
particular cognitive abilities, such as ‘Broca’s area’ for speech production.
The development of neurophysiological recording techniques in the
twentieth century led to Barlow’s ‘neuron doctrine’, according to which
the functions of individual neurons can be extrapolated to explain the
function of the brain as a whole. The cognitive neuroimaging studies of
the 1980s focused on subtractive comparisons between cognitive tasks
meant to isolate specific cognitive operations, and led to a relatively
modular view of brain function as involving localized and separable
regions that implement elementary mental operations.

The methods of contemporary human neuroscience have provided a
much more complex and nuanced view of the human brain as a dynamic
network with multiple levels of organization, in which function is
characterized by a balance of regional specialization and network
integration. Although current methods are limited in their utility for
studying brain function at fine-grained levels of organization (such as
single neurons or cortical columns), human neuroscience has nonetheless
made remarkable progress in understanding basic aspects of functional
organization, and with this have come a number of applications to
address real-world problems. Our goal here is to review the current state
of human neuroscience, focusing on what kinds of questions can and
cannot be answered using current techniques and how those answers are
relevant to real-world applications.


How can we study the human brain? 

Methods for studying human brain function can be organized according 
to the kinds of mechanistic insights that each technique provides. As 
shown in Table 1, the first characteristic is the level of mechanism
captured by the method. Mechanisms range from the molecular level
(neurotransmitters and receptors) to large-scale networks (the dynamic
integration and coordination of different functional areas of the brain). 
Although this distinction is related to physical scale, it does not depend 
on the method’s spatial resolution per se. For example, positron emis- 
sion tomography (PET) using neurotransmitter ligands measures 
molecular mechanisms, even though its spatial resolution is on the order
of one centimetre. The second characteristic is the ability of each method
to elucidate the mechanistic role of an observed brain molecule, cell, 
region or network in a mental function of interest. By mechanism we 
mean the causal chain of events that results in the realization of a
function. To fully understand human brain function is to know the causal
chains of events at the molecular, cellular, population, and network 
levels that give rise to psychological function. For this reason, the power 
to identify causal relationships is a crucial dimension of difference 
among methods. 

Some methods used in the study of human brain function provide 
relatively little insight into causal mechanisms. This includes methods that 
exploit naturally occurring variation by observing the strength of asso- 
ciation between individual differences in brain function and behaviour. 
Analysis of relationships between behavioural traits, genes, brain structure, 
and brain function exemplifies this approach (see Box 1 for a discussion of
genomic approaches). For many important psychological phenomena, 
from effects of life history to personality traits, we are limited to obser- 
vational methods. For example, individual differences in the personality 
trait of impulsiveness have been associated with differences in striatal 
dopamine release¹, functional MRI (fMRI) activation², and cortical grey
matter volume³. Observed associations between neural and psychological
traits do not necessarily imply a causal relationship, as these associations 
could result from an unmeasured third variable that independently influ- 
ences the two measures. Nevertheless, such associations provide a valu- 
able starting point for theorizing about the neural mechanisms of human 
psychology, and their evidentiary value can be strengthened by measuring 
possible confounds to rule them in or out. 
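
The third-variable problem, and the value of measuring a suspected confound, can be illustrated with simulated data. The sketch below is a hedged illustration rather than an analysis from any cited study: the variables `brain`, `trait` and `confound` are hypothetical, and the data are generated so that the first two are related only through the third. It compares the raw correlation with the partial correlation that controls for the confound.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(a, b, z):
    """First-order partial correlation of a and b, controlling for z."""
    r_ab, r_az, r_bz = pearson(a, b), pearson(a, z), pearson(b, z)
    return (r_ab - r_az * r_bz) / math.sqrt((1 - r_az ** 2) * (1 - r_bz ** 2))


rng = random.Random(0)  # fixed seed so the simulation is repeatable
n = 5000
confound = [rng.gauss(0, 1) for _ in range(n)]    # hypothetical third variable
brain = [z + rng.gauss(0, 1) for z in confound]   # simulated neural measure
trait = [z + rng.gauss(0, 1) for z in confound]   # simulated psychological trait

raw_r = pearson(brain, trait)
adjusted_r = partial_corr(brain, trait, confound)
print(f"raw r = {raw_r:.2f}, partial r = {adjusted_r:.2f}")
```

Here the raw association is substantial even though neither simulated measure influences the other, and it largely vanishes once the confound is accounted for. A partial correlation can only rule out confounds that were actually measured, which is why such evidence remains observational.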

Although functional neuroimaging, electroencephalography/magnetoencephalography
(EEG/MEG) and single-cell recordings are sometimes
criticized as being purely correlative and therefore uninformative
about mechanism, that criticism is only partly accurate. When psycho- 
logical processes are experimentally manipulated by presenting a certain 
kind of stimulus and/or engaging the subject in a task, we can infer 
that any reliably elicited brain activity was caused by performing these 
psychological functions. We cannot, however, infer with confidence 
that the observed brain activity is causally responsible for the psycho- 
logical process under study. Despite this limitation (which is shared by 
neuronal recordings in non-human animals), neuroimaging studies 


¹Department of Psychology, Stanford University, Stanford, California 94305, USA. ²Center for Neuroscience & Society, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.


15 OCTOBER 2015 | VOL 526 | NATURE | 371 


©2015 Macmillan Publishers Limited. All rights reserved 


REVIEW 


Table 1 | An overview of the levels of analysis and levels of causal inference afforded by different human neuroscience methods

Level of mechanism (columns): Molecules, Cells, Populations, Networks
Strength of causal evidence (rows):

Purely observational (associations do not necessarily imply causal relations between mind and brain)
  Molecules: Genetic associations with behaviour, brain function or brain structure; postmortem studies of gene expression correlated with psychological traits; correlations of MRI spectroscopy or PET ligand imaging with psychological traits
  Cells: no entry
  Populations: Structural morphometry correlated with psychological traits
  Networks: Resting functional connectivity (fMRI, EEG/MEG) or structural connectivity (sMRI, DTI)

Manipulate psychological process and observe brain (neural measures may be epiphenomenal)
  Molecules: Task modulation studies using PET with neurotransmitter ligands or MRI spectroscopy
  Cells: Intracerebral recording in surgical patients
  Populations: Task activation studies (PET, fMRI, EEG/MEG); representational analysis (fMRI, EEG/MEG); computational neuroimaging (fMRI, EEG/MEG)
  Networks: Task-based functional connectivity (fMRI, EEG/MEG)

Manipulate brain and observe psychological results (demonstrates causal effect of neural system in behaviour)
  Molecules: Pharmacological manipulation (including hormones)
  Cells: Direct brain stimulation in surgical patients
  Populations: Focal cortical lesions; transcranial magnetic stimulation; transcranial electrical stimulation; cortical surface electrode stimulation in surgical patients
  Networks: Disconnection/white matter lesions

DTI, diffusion tensor imaging; EEG/MEG, electroencephalography/magnetoencephalography; fMRI, functional MRI; MRI, magnetic resonance imaging; PET, positron emission tomography; sMRI, structural MRI.


in which psychological processes are manipulated comprise the 
majority of current human neuroscience research, and have advanced 
our understanding of human brain function, as we will discuss in more 
detail below. 


BOX 1
Challenges of merging neuroimaging and genomics


The substantial heritability of many psychological functions has driven 
great interest in finding genetic underpinnings of individual 
differences in neural function. Twin and family studies have 
demonstrated significant heritability for both task-related BOLD 
responses” and resting-state functional connectivity? in fMRI. In the 
past decade, a large number of studies have also reported associations 
between BOLD signals and common variants in candidate genes. 
Unfortunately, this approach has generally been unsuccessful in 
identifying genetic associations that are replicated in genome-wide
association studies (GWAS). For example, a striking finding from the
first well-powered GWAS of genetic variants associated with brain
volume was that none of the associations previously identified through 
candidate gene studies were replicated at the genome-wide level®. 
Similarly, candidate gene associations with cognitive function (such as
the association between polymorphisms in the COMT gene and 
working memory) and brain activation have generally not been 
confirmed in meta-analyses, and are subject to a substantial degree of 
publication bias®**. Like for many other areas of genetics, this
suggests that genome-wide approaches are the most likely to lead to 
reliable identification of common variants related to brain structure 
and function. However, GWAS approaches require large samples (in 
the tens of thousands) which are very difficult to amass for task-based 
fMRI studies; for that reason, GWAS-based approaches to probing the 
human brain will likely be limited to resting-state fMRI and structural 
MRI. Other strategies, such as targeted studies investigating rare 
variants of large effect identified using genome sequencing or studies 
using gene expression in peripheral tissues may have greater utility for 
genetic studies of task-based fMRI. Task-based fMRI may also be used 
to further investigate candidate variants identified on the basis of 
GWAS studies of psychiatric disorders or population variability. 
More decisive evidence concerning causal necessity can be obtained
by manipulating the brain itself to assess the resulting effect on the 
psychological process in question. Naturally occurring or surgical 
lesions, which provided the basis for most of what we knew about 
human brain function before the advent of neuroimaging, are still of 
great interest because they provide insight into the causal necessity of 
specific brain regions or connections. More recently developed methods 
of brain stimulation allow for reversible inhibition or excitation of a 
brain area, thereby expanding our ability to test the causal role of brain 
regions in the mechanisms of human thought and action. Deep brain 
stimulation (DBS) provides the most precise method for targeted stimu- 
lation by using surgically implanted electrodes, but is limited to situa- 
tions where patients are undergoing implantation for medical reasons. 
Use of non-invasive brain stimulation for research purposes has grown 
rapidly in recent decades, starting with transcranial magnetic stimu- 
lation (TMS), in which pulsed magnetic fields induce currents in the 
brain. Various forms of transcranial electric stimulation (TES), in which 
current is delivered using external electrodes, have also been used, of 
which the most common variant is transcranial direct current stimu- 
lation (tDCS). Unlike DBS, non-invasive brain stimulation generally 
affects larger and more superficial areas of the brain, but researchers 
are seeking to improve spatial resolution with new magnetic coil shapes 
for TMS and new electrode configurations for tDCS. Focused ultra- 
sound is also being explored as a means to stimulate more precisely 
delimited brain regions⁴. Pharmacological agonists and antagonists of
particular neurotransmitter systems can be used to experimentally 
manipulate the human brain at the molecular level, although with 
imperfect specificity⁵. By combining each of these manipulations of
brain function with functional brain imaging, one can leverage the cau- 
sal information obtained through pharmacological challenges or brain 
stimulation. For example, the causal role of activity in specific brain 
regions, identified using fMRI, for a particular function has been tested 
by brain stimulation, using both direct cortical stimulation (for example, 
ref. 6) and TMS⁷.


New capabilities of fMRI


Because fMRI has become the main method for the study of human 
brain function, our review focuses on this method and new ways of using 
it. In the last two decades, fMRI has transitioned from a newly developed 
technique for revealing neuronal activity to being the workhorse method 
of cognitive neuroscience (see the recent special issue of Neuroimage on 
the first 20 years of fMRI⁸). Much has been learned about the biological
mechanisms underlying blood oxygen level dependent (BOLD)
signals⁹,¹⁰, but still much remains to be understood, such as the roles of
specific glial and neuronal cell types in the coupling of neuronal activity 
to blood flow (for example, refs 11, 12). This limited physiological 
understanding poses problems for the interpretation of fMRI data. In
particular, although fMRI signals often correlate strongly with both 
action potentials (‘spikes’) and local field potentials, they are largely 
reflective of post-synaptic processes, and in some cases they can be 
dissociated from spiking altogether¹³. The relative sensitivity of fMRI
to post-synaptic processes as opposed to spiking has been seen as a 
drawback by some who view spikes as the essence of brain function, 
but it is worth noting that this discovery has actually rekindled interest 
in the analysis of local field potentials in electrophysiology (where these 
signals have long been discarded) (for example, ref. 14), and suggests 
that fMRI may sometimes be sensitive to subthreshold signals that
would be missed by analysis of spikes only. Uncertainties in relating 
fMRI to psychological, as well as physiological, processes have also been 
debated, and progress has been made on this front too. From experimental
approaches such as adaptation paradigms for probing representations
to analyses of functional connectivity, fMRI is routinely used to
answer questions about mind-brain relationships that go far beyond 
localization¹⁵. Here we discuss three examples of new approaches to
understanding human brain function with fMRI that address questions 
of representation, computational processes and network interactions 
across the brain. 


Representational analyses 

Early work in neuroimaging focused largely on ‘brain mapping’— 
identifying regions based on the mental processes that cause them to 
be activated. This approach has provided a large body of reliable asso- 
ciations between function and structure, but has not been particularly 
successful in providing new insights into how psychological functions 
are implemented¹⁶. However, two relatively recent approaches, known
as multi-voxel pattern analysis (MVPA)¹⁷ and representational similarity
analysis (RSA)¹⁸, can more directly relate psychological contents to
brain function (Fig. 1). MVPA involves the use of methods from the 
field of machine learning to decode or predict psychological states 
from patterns of brain activation across voxels (hence the term ‘brain- 
reading’). Since its introduction more than a decade ago, MVPA 
has been used in a number of domains to demonstrate the predictive 
ability of fMRI activation patterns. Perhaps the most impressive are
demonstrations of the ability to successfully reconstruct visual scenes¹⁹
and faces²⁰ from BOLD activity patterns; similar advances have been
made for higher cognitive functions such as word meaning²¹. These
studies go beyond simply differentiating between experimental condi- 
tions, as they show how the underlying representational spaces relate to 
brain activity; for example, using a related approach known as voxel-wise
modelling, Huth and colleagues²² developed a model that estimated
the response at each location on the cortical surface to a large
number of visual and semantic features present in natural movies 
(Fig. 2). MVPA approaches have also provided new insights into the 
neural organization of cognitive functions. For example, MVPA has 
informed our understanding of the mechanisms of visual attention, by 
showing that attention changes both the representation of stimuli
across regions of visual cortex and the mutual information
between regions²³. In the domain of memory, MVPA has been used
to show that competition between memory representations in working 
memory leads to poorer subsequent memory for those items,
demonstrating a nonmonotonic relationship between competition and
subsequent memory²⁴.
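
The decoding idea behind MVPA can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the method of any study cited here: the nine-voxel patterns are simulated, the category prototypes are made-up numbers, and a nearest-centroid classifier with leave-one-out cross-validation stands in for the more elaborate machine-learning methods used in practice.

```python
import random


def centroid(patterns):
    """Mean pattern across trials, voxel by voxel."""
    return [sum(col) / len(col) for col in zip(*patterns)]


def sq_dist(a, b):
    """Squared Euclidean distance between two voxel patterns."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def leave_one_out_accuracy(patterns, labels):
    """Hold out each trial in turn and classify it by the nearest class centroid."""
    correct = 0
    for i, (test, true_label) in enumerate(zip(patterns, labels)):
        train = [(p, l) for j, (p, l) in enumerate(zip(patterns, labels)) if j != i]
        centroids = {
            lab: centroid([p for p, l in train if l == lab])
            for lab in set(labels)
        }
        predicted = min(centroids, key=lambda lab: sq_dist(test, centroids[lab]))
        correct += predicted == true_label
    return correct / len(patterns)


# Hypothetical mean response of nine voxels to each stimulus category.
prototypes = {
    "bird": [1.0, 0.5, -0.5, 1.0, 0.0, -1.0, 0.5, -0.5, 0.0],
    "furniture": [-1.0, 0.0, 0.5, -0.5, 1.0, 0.5, -1.0, 1.0, 0.5],
}

rng = random.Random(1)  # fixed seed so the simulation is repeatable
patterns, labels = [], []
for label, proto in prototypes.items():
    for _ in range(20):  # 20 simulated trials per category
        patterns.append([v + rng.gauss(0, 0.5) for v in proto])
        labels.append(label)

accuracy = leave_one_out_accuracy(patterns, labels)
print(f"leave-one-out decoding accuracy: {accuracy:.2f}")
```

Holding out the trial being classified matters: if the test pattern also contributed to the centroids, the accuracy estimate would be optimistically biased.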

Whereas MVPA is generally used to decode individual psychological 
states, RSA instead asks how the patterns of brain activity evoked by 
different stimuli are related to one another, and thus provides the 
means to directly address questions of how mental representations 
are implemented in the brain. RSA has enabled the demonstration of 
Figure 1 | Different approaches to the analysis of fMRI data. This example
depicts data from a hypothetical study in which four different stimuli were 
presented (two birds and two items of furniture) and response measured for 
each item across nine voxels; intensity of activity is depicted from blue 
(negative) to red (positive). The standard univariate fMRI analysis approach
would examine the difference at each voxel between the averages of the two 
categories. Multi-voxel pattern analysis (MVPA) examines the 
multidimensional relationship between patterns of activity, in this case 
projecting the nine-dimensional space of voxel patterns (the voxel vector) into a 
two-dimensional space and identifying a boundary that separates items from 
the two classes. Representational similarity analysis (RSA) examines the 
correlations between activity patterns for each item, in this case showing that 
items within category show a high correlation (red), whereas the correlation of 
items between categories is low (blue). 


direct isomorphisms between psychological representations of stimuli 
(such as the similarity or typicality of objects) and the neural patterns 
associated with those stimuli²⁵,²⁶. Because psychological theories often
make predictions regarding the similarity of different stimuli, RSA has 
also enabled the direct testing of theories, such as theories about how 
categories are represented²⁷ and theories of how repeated experiences
lead to enhanced learning²⁸. RSA can be applied to any kind of
multidimensional data, and this has enabled the demonstration of systematic
mappings of visual object representations between humans (using 
fMRI) and non-human primates (using electrophysiological
recordings)²⁹, an example that highlights how human neuroscience can also
help to establish more direct parallels with findings in non-human 
models, allowing insights to filter in both directions. 
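
The core computation in RSA, as described for Fig. 1, is a matrix of correlations between the activity patterns evoked by each stimulus. The sketch below illustrates that step only: the four stimuli and their nine-voxel patterns are made-up numbers, chosen so that items within a category resemble each other.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length voxel patterns."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical nine-voxel response patterns for two birds and two items of furniture.
patterns = {
    "robin": [1.0, 0.6, -0.4, 0.9, 0.1, -1.0, 0.5, -0.6, 0.0],
    "eagle": [0.8, 0.4, -0.6, 1.1, -0.1, -0.8, 0.6, -0.4, 0.2],
    "chair": [-1.0, 0.1, 0.5, -0.4, 1.0, 0.6, -0.9, 1.0, 0.4],
    "table": [-0.9, -0.1, 0.4, -0.6, 0.9, 0.4, -1.1, 0.9, 0.6],
}

names = list(patterns)
# Representational similarity matrix: one correlation per pair of stimuli.
similarity = {
    (a, b): pearson(patterns[a], patterns[b]) for a in names for b in names
}

for a in names:
    print(a.ljust(6), "  ".join(f"{similarity[(a, b)]:+.2f}" for b in names))
```

With these invented patterns, within-category pairs correlate strongly and between-category pairs correlate negatively. A matrix of this kind could then be compared with one derived from behavioural similarity judgements or from recordings in another species, which is the sort of comparison the text describes.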

Although much MVPA and RSA work (as depicted in Fig. 1) has 
focused on the representations found in localized brain regions, these 
methods are equally useful for assessing representations that are spread 
across the brain. For example, recent work has shown that mental states 
such as physical pain can be decoded by analysis of patterns of activation 
across brain regions³⁰.

The legitimate enthusiasm about these methods is tempered by lingering
questions regarding the interpretation of multivariate
analyses³¹,³². In addition, recent work combining electrophysiology and
fMRI in non-human primates has demonstrated that the sensitivity of 
MVPA is limited by the spatial characteristics of the neuronal representations
that code for particular features, such that some kinds of neuronal
patterns may be more difficult to decode using MVPA than
others³³. Finally, it is important to stress that, like standard neuroimaging
approaches, MVPA and RSA approaches do not inform about
causal mechanisms. 
Figure 2 | A mapping of high-dimensional semantic space onto the cortical
surface. Here, voxel patterns for 1,705 different action and object categories,
based on brain activity obtained during viewing of natural movies²², are mapped
onto the cortical surface image generated using an online browser
(http://gallantlab.org/semanticmovies/). a, Mapping of semantic categories to each
point on the surface; the colours on the surface map correspond to the semantic 
map in panel b. b, A depiction of the semantic space derived from all 
semantically selective voxels. Categories that have similar colours in the 
semantic space are represented in similar patterns of voxels in the brain. Data 
from ref. 22. 


Integrating fMRI and computational modelling

Computational models play a central role in our understanding of both 
cognitive and brain functions and, increasingly, of the relationship 
between the two. By making assumptions explicit, computational mod- 
els enable more direct testing of theories, as well as providing the means 
to link computations at the neuronal level with higher-order functions. 
An example of an area in which substantial progress has been made 
using this approach is reinforcement learning, in which an animal 
selects actions and learns from the rewards gained from those actions. 
Computational models of reinforcement learning (RL) have long played 
a central role in artificial intelligence and psychology, and the discovery 
by Schultz and colleagues that dopamine neurons appear to signal one
of the important quantities in these models (reward prediction error)
has brought these models to the forefront of the neuroscience of decision
making. For example, a set of publications in 2003 applied RL models to
neuroimaging data and thereby identified correlates of reward prediction
error signals in dopaminergic target regions such as the ventral
striatum. Subsequent neuroimaging work has established that there
are multiple RL signals in the brain, some reflecting the simple association
between actions and values (known as ‘model-free’ RL) and
others reflecting more complex contextual and hierarchical learning
processes (known as ‘model-based’ RL). Similarly, in the study of
memory, progress has been made in the mapping of medial temporal 
lobe subregions to specific computational operations such as pattern 
completion and pattern separation (for example, ref. 39). In each of 
these domains, the computational interpretation of neuroimaging signals
has been greatly enhanced by parallel studies in non-human animals,
which allow imaging signals to be linked more closely to direct
measures of neuronal activity.
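
To make the notion of a reward prediction error concrete, the following is a minimal sketch of a simple value-learning rule of the Rescorla-Wagner kind, written in Python. It is offered only as an illustration of the quantity discussed above; the function name, learning rate and reward sequence are assumptions chosen for the example and are not taken from the cited studies.

```python
def learn_values(rewards, learning_rate=0.2, initial_value=0.0):
    """Track an expected value and the prediction error on each trial.

    On every trial the prediction error is the received reward minus the
    current expectation; the expectation then moves a fraction
    (learning_rate) of the way towards the reward.
    """
    value = initial_value
    values, prediction_errors = [], []
    for reward in rewards:
        error = reward - value           # reward prediction error
        values.append(value)             # expectation before the outcome
        prediction_errors.append(error)
        value += learning_rate * error   # update the expectation
    return values, prediction_errors

# A cue that is always followed by a reward of 1.0 for 50 trials.
values, prediction_errors = learn_values([1.0] * 50)
print(f"first-trial error: {prediction_errors[0]:.3f}")
print(f"last-trial error:  {prediction_errors[-1]:.6f}")
```

In this toy setting the prediction error is large when the reward is unexpected and shrinks towards zero as the reward becomes fully predicted. A trial-by-trial series of this kind is the sort of model-derived quantity that can be related to imaging signals.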


Functional connectivity analysis and resting-state fMRI
Perhaps the most revolutionary development to arise from human neu- 
roimaging research is the realization that the resting brain is far from 
quiescent, and that important insights into brain function can be gained 
from studying the correlated fluctuations of signals across the brain at rest. 
Much of the research into the resting state has focused on a set of regions
(including anterior and posterior midline regions, lateral temporoparietal
cortex, and the medial temporal lobe), known as the ‘default mode’ network,
that are consistently less active during performance of difficult
tasks and are functionally connected in the resting state. Similar patterns
of resting connectivity have been observed in non-human primates
and awake rodents, suggesting that they reflect fundamental principles
of mammalian brain organization. There is also growing evidence that 
these networks may be important in brain disorders. For example, the 
posterior portion of the default mode network appears to play a critical 
role in the memory deficits observed in Alzheimer disease, showing a 
convergence of amyloid deposition, structural atrophy, and decreased 
metabolic activity.
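
In its simplest form, functional connectivity of the kind described above is estimated as the correlation between the signal time courses of different regions. The sketch below illustrates this in Python with NumPy on simulated signals. The region names, number of time points and noise levels are assumptions made for the example, and real analyses involve preprocessing steps that are omitted here.

```python
import numpy as np

rng = np.random.default_rng(1)
n_timepoints = 300

# Two hypothetical regions share a slow common fluctuation;
# a third hypothetical region fluctuates independently.
shared_fluctuation = rng.normal(size=n_timepoints)
region_a = shared_fluctuation + 0.5 * rng.normal(size=n_timepoints)
region_b = shared_fluctuation + 0.5 * rng.normal(size=n_timepoints)
region_c = rng.normal(size=n_timepoints)

# Rows are regions, columns are time points.
timeseries = np.vstack([region_a, region_b, region_c])

# Pairwise Pearson correlations between regional time courses.
connectivity = np.corrcoef(timeseries)
print(np.round(connectivity, 2))
```

Regions that share a common fluctuation show a high correlation even though no task is being performed, whereas the independent region does not.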

Data collected in the resting state can provide insights into the broader 
functional organization of the brain as well. In particular, the organization 
of resting state signals bears a close relation to the organization of brain 
activity evoked by mental tasks. For example, Smith et al. used independent
component analysis to identify spatially independent sets of voxels
from resting-state fMRI data and from task-based data (obtained from
the BrainMap meta-analytic database), and demonstrated that the components
extracted from resting-state fMRI showed a high degree of concordance
with those extracted from task-based data. The overlap between
resting-state and task-based functional organization can also be seen
within individuals; for example, the longitudinal examination of a single
individual revealed reliable spatial parcellation of activity in the cerebral
cortex (using resting-state fMRI data) that mapped systematically to the activation
patterns observed across a large number of task measurements.

Despite the substantial excitement around resting-state fMRI findings,
numerous concerns have been raised about their interpretation.
In particular, there are lingering questions regarding the ways in
which artefacts related to head motion and physiological fluctuations
may influence estimates of resting-state connectivity, and whether
common data analytic methods may induce systematic artefacts.
In addition, potential confounds such as light sleep may drive differences
in resting-state signals. The unconstrained nature of resting-state
fMRI is a double-edged sword: it is potentially very useful for the
study of clinical groups for whom task performance may be difficult,
but at the same time, it is not possible to determine whether group
differences reflect fundamental differences in functional connectivity
or relative differences in the ongoing mental content of different
groups during rest (see ref. 51).


Applications of human neuroscience 

With the development of new methods have come attempts to apply 
them to real-world problems, in both medical and non-medical con- 
texts. (See Box 2 for a discussion of the ethical, legal, and societal issues 
raised by these applications.) 
BOX 2
Ethical, legal and societal impact of human neuroscience


As the methods of human neuroscience find broader application, they 
affect human life in new ways. The field of ‘neuroethics’ is concerned 
with ethical, legal and societal issues raised by these new
applications.

Two kinds of problems have emerged from the increasing ability of
brain imaging to reveal aspects of individual psychology: problems
that arise from the current and imminent capabilities of these
methods, and problems that arise when capabilities are claimed for
them that they lack. To the extent that imaging can predict important personal
characteristics such as health status, academic achievement, and
criminal behaviour, its use must be managed with care to protect
privacy and avoid discrimination. To the extent that imaging cannot
provide help with high-stakes problems, the public should be protected
from claims that it can. For example, a seemingly ‘scientific’ method
for detecting lies or diagnosing psychiatric disorders has a strong
appeal to the general public, who cannot be expected to appreciate the
gap between what is claimed and what is established fact.

New ways of changing brain function pharmaceutically, and with 
electromagnetic stimulation, also raise new ethical issues. Of course, 
humanity has long manipulated brain function to modify mental 
states using substances such as alcohol and caffeine. However, 
psychopharmacology has broadly penetrated our everyday lives and
the scope of psychiatric diagnoses and treatment has expanded, a
societal shift that some find troubling. Furthermore, many now use
psychoactive drugs purely for enhancement of healthy brain function
rather than to treat a medical condition. Aside from issues of safety
and efficacy, brain enhancement raises issues of fairness (is it akin to 
doping in sports?), justice (will the ability to access enhancements 
widen the already existing gaps between haves and have-nots?) and 
social standards (will unenhanced job performance become sub- 
standard?). 

Non-invasive brain stimulation is the newest method for brain 
enhancement. Simple transcranial electrical stimulation (for example, 
tDCS) devices are available to consumers at relatively low cost and
regulation is minimal. Given the public interest in this method and
the rudimentary state of knowledge about its effects, it is crucial that
the safety and efficacy of these methods are established. The efficacy
of cognitive enhancement with tDCS is hotly debated, and whether
long-term use of tDCS is safe has yet to be studied. In addition, the
neuroethical issues of fairness, justice and social standards mentioned
above also apply to enhancement of brain function by brain 
stimulation. 


Brain disorders 
The methods of human neuroscience hold particular promise for under- 
standing and treating psychiatric disorders, because these disorders do 
not have clear analogues in non-human animals, and animal models 
currently used for preclinical screening of potential therapies are 
increasingly regarded as being inadequate. In the absence of valid
animal models, it becomes all the more crucial to apply new methods 
for understanding human brain function and dysfunction. The goal of 
improving the treatment of neuropsychiatric disorders is made even 
more challenging because of our current diagnostic system. Although 
depression, schizophrenia, autism and other serious psychiatric disor- 
ders have long been considered disorders of the brain, they are still 
diagnosed exclusively by behavioural signs and symptoms. These dia- 
gnostic criteria do not seem to have clear relations to the biological 
processes that would be targeted by new medical treatments. 

In response to this problem, an alternative way of systematizing psychiatric
disorders has been developed: the NIMH Research Domain
Criteria (RDoC), which describe disorders according to impairments
in specific functional systems of the brain (such as fear or reward learn- 
ing) and at different levels of mechanism of the kind represented in 
Table 1 (for example, molecules or circuits). RDoC characterizations 
cut across traditional diagnostic categories and are intended to capture 
the underlying pathophysiology more accurately. Given the multiple 
levels of mechanism captured by the RDoC, the system encourages 
research with a broad array of methods to identify potentially targetable 
dysfunctions. 

The application of several human neuroscience methods has led to 
the development of targeted treatments, for example, in the field of 
depression. Functional imaging studies have highlighted the role of 
the subgenual anterior cingulate cortex in a network of regions involved 
in mood, leading Mayberg and colleagues to use deep brain stimulation 
in this area to regulate mood in depressed patients. Lateral prefrontal
regions, implicated through imaging studies in depression, have been 
targeted with non-invasive brain stimulation, including the FDA- 
approved use of TMS for treatment-resistant depression. Functional 
neuroimaging can itself be used as a treatment, by providing patients 
with a real-time measure of regional brain activity to use as a biofeed- 
back signal. This approach is being tested for the treatment of chronic 
pain, depression and addiction. In contrast, neuroimaging has not so
far been very successful in aiding differential diagnosis of disorders in 
terms of current diagnostic categories. A recent large meta-analysis 
identified a set of regions in which structural abnormalities were consistently
associated with psychiatric disorders, but found very little specificity
for individual disorders, consistent with the notion that current
diagnostic distinctions are not biologically realistic categories. 

Another approach to the discovery of therapeutic targets is the use of 
genetic association studies to identify sets of genes that are associated 
with a disorder and that together may indicate particular molecular 
pathways underlying the disorder. Although the number of subjects
needed to establish reliable genetic associations is daunting, progress has
been made through large international collaborations. For example, the
Psychiatric Genomics Consortium has to date identified more than
100 common genetic variants reliably implicated in schizophrenia.
Imaging can also be used to develop endophenotypes (or intermediate 
phenotypes) that may bear a closer relation to the effect of a gene variant 
than does disease diagnosis, as well as to mitigate the problem of het- 
erogeneity within conventional diagnostic categories (see Box 1). 

It may be less surprising that the methods developed for human 
neuroscience research have been applied in the diagnosis and treatment 
of neurological diseases, but at least two recent developments deserve 
mention here. Studies of Alzheimer disease at mechanistic levels from 
molecules to systems have improved diagnostic accuracy and have 
enabled a degree of prediction before clinical signs of the disease.
Molecular biomarkers from blood and CSF, and patterns of brain activ- 
ity and structure have revolutionized clinical research in this area by 
facilitating trials of preventive treatment and by providing intermediate 
phenotypes as early gauges of effectiveness. Disorders of consciousness 
following severe brain damage are another area of clinical neuroscience 
for which neuroimaging shows promise. Some patients who have been 
diagnosed as being in the vegetative state can follow commands to 
imagine actions that activate specific areas of the brain in much the 
same way as healthy control subjects do, and can even use these ima- 
gined actions to answer questions (for example, “Do you have any 
brothers? If yes, imagine playing tennis, if no, imagine walking through 
your house.”). Thus, neuroimaging offers new insights into the assessment
of consciousness, as well as the distinct problem of prognosis, in
severely brain-damaged patients. 


Predicting behaviour 

The ability to predict future behaviour is of value in almost every sphere 
of human activity. Although it has often been said that “the best pre- 
dictor of future behaviour is past behaviour,” in some cases brain 
imaging can improve our ability to predict future behaviour, over and 
above what we can do with behavioural history. Marketing professionals
were among the first to attempt to predict behaviour using brain 
imaging. Recognizing the limitations of focus groups and other tra- 
ditional methods to discern what consumers want, they have used func- 
tional neuroimaging to predict the effects of different advertising 
campaigns, packaging, and other factors on consumer behaviour, based 
on the premise that activity in the brain’s reward or motivation centres 
may be a more direct measurement of wanting than are verbal self-reports.
Although most of this work is conducted by and for corporations
aiming to improve sales rather than share scientific knowledge,
published academic studies have begun to lend some credence to the 
potential of neuromarketing. For example, when teenage subjects were 
scanned while listening to unfamiliar songs, the reward system activity 
evoked by the songs, but not the subjects’ ratings of their likeability, was 
predictive of sales of the songs over the subsequent three years.

Prediction is also important outside of business. Falk and colleagues 
have adapted neuromarketing methods for the purpose of creating 
more effective public service announcements. They showed that brain 
responses (but not ratings) to an anti-smoking advertisement were 
predictive of subsequent call volume to an anti-smoking hotline. 
Gabrieli et al. recently summarized evidence concerning neuroimaging-based
prediction in domains ranging from healthful eating to
criminal recidivism, including numerous examples of prediction of 
educational outcomes. Indeed, neuroimaging can predict future aca- 
demic skills over and above traditional behavioural predictors, thus 
enabling earlier and more appropriate interventions to address indi- 
vidual children’s reading and math difficulties. These authors also 
pointed out a number of methodological challenges in neuroima- 
ging-based prediction of behaviour, including the need to develop 
and test predictions with different samples, to avoid the ‘optimism 
bias’ that occurs when predictions are tested in the same population 
from which they were generated. 
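
The ‘optimism bias’ mentioned above can be demonstrated with a small simulation. The sketch below, in Python with NumPy, fits a linear model to features that are pure noise and then evaluates it both on the sample used for fitting and on an independent sample. All sizes, names and the random seed are assumptions chosen for illustration; this is not the procedure of any cited study.

```python
import numpy as np

def r_squared(observed, predicted):
    """Proportion of variance explained (can be negative out of sample)."""
    residual = np.sum((observed - predicted) ** 2)
    total = np.sum((observed - np.mean(observed)) ** 2)
    return 1.0 - residual / total

def fit_linear(features, outcome):
    """Ordinary least squares with an intercept term."""
    design = np.column_stack([np.ones(len(features)), features])
    weights, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return weights

def predict(weights, features):
    """Apply fitted weights (intercept first) to new feature rows."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ weights

rng = np.random.default_rng(42)
n_train, n_test, n_features = 40, 40, 20

# Hypothetical 'brain features' and 'behavioural outcome' with no true
# relationship: any apparent predictive power is due to overfitting.
x_train = rng.normal(size=(n_train, n_features))
y_train = rng.normal(size=n_train)
x_test = rng.normal(size=(n_test, n_features))
y_test = rng.normal(size=n_test)

weights = fit_linear(x_train, y_train)
in_sample_r2 = r_squared(y_train, predict(weights, x_train))
out_of_sample_r2 = r_squared(y_test, predict(weights, x_test))
print(f"in-sample R^2: {in_sample_r2:.2f}")
print(f"out-of-sample R^2: {out_of_sample_r2:.2f}")
```

Even though the features carry no information about the outcome, the model appears to explain a substantial share of the variance in the sample on which it was fitted, and this apparent success disappears in the independent sample. This is why predictions need to be developed and tested in different samples.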


Human neuroscience in the courtroom 

In recent years the methods of human neuroscience have found their 
way into the courtroom. Perhaps the most obvious, but also the most 
misunderstood, role for neuroscience is in helping to determine criminal 
responsibility. Proving that a criminal act may have had a neural cause is
not in itself exculpatory, as every human act is caused by the brain.
However, to the extent that neuroscience can provide evidence of mental 
dysfunction (for example, a tumour in the frontal cortex that may have 
impaired the ability to control behaviour), immaturity or other psychological
grounds for reduced criminal responsibility, it is potentially relevant
and has been used. For example, the US Supreme Court explicitly
cited neuroscience evidence in its decision in Graham v. Florida to 
abolish life in prison without parole for juveniles who commit non- 
homicidal offences. It is more difficult to make legal arguments for 
applying neuroimaging evidence to individual cases because most find- 
ings from neuroimaging research are generalizations based on groups of 
people and may therefore not allow reliable inferences regarding individuals.
Nevertheless, neuroimaging scans from defendants are sometimes
presented in the sentencing phase of criminal trials as grounds for
mitigation of the sentence, as weaker evidentiary standards apply in the 
sentencing phase. 

Neuroimaging can be applied in ways other than determining degree 
of responsibility. Lie detection is one example that has been pursued in 
legal contexts, although it has not so far been admitted into US courts 
and has yet to demonstrate validity, reliability or resistance to countermeasures
outside of the laboratory. Another application concerns
pain: brain-based biomarkers for pain would help discriminate real 
suffering from malingering—a pivotal issue in many lawsuits—and have 
been admitted as evidence in at least one US case.


Challenges and future directions for neuroimaging 


The field of neuroimaging is growing rapidly, and there are a number of 
exciting new directions on the horizon. 
New technologies for imaging and manipulating the human brain

Rapid advances in non-human neuroscience have been driven by the 
development of technologies that measure and manipulate brain func- 
tion with increasing precision. Human neuroscience has lagged in this 
respect, in part because of the ethical challenges associated with direct 
manipulation and neuronal recording of the human brain. However, in 
response to the urgent need for better treatments for psychiatric dis- 
orders, research is underway with the aim to design implantable systems 
for sensing and modulating human brain networks. The development
of optogenetic and ‘opto-fMRI’ approaches in non-human primates
suggests that these methods may one day become feasible for use in
human studies, and it is likely that electrical brain stimulation will
eventually be supplemented with optogenetic approaches. Although
such invasive techniques will likely only be used in rare clinical cases
(that is, in patients who are undergoing implantation for medical reasons),
they have the potential to provide much greater specificity in circuit 
mapping. 

fMRI will probably remain the principal neuroimaging method in 
humans in the foreseeable future. However, the ongoing BRAIN Initiative
in the United States is providing substantial funding to develop
entirely new techniques for imaging of brain function, and a significant 
proportion of this funding will go specifically towards the development 
of new methods for imaging the human brain. In addition, new devel- 
opments in MRI have greatly increased the utility of standard MRI 
systems. For example, multiband imaging techniques have enabled a
several-fold increase in the temporal resolution of fMRI acquisitions, 
and higher MRI field strengths (7 tesla and higher) hold promise 
to enable improvements in spatial resolution as well (for example, 
ref. 72). There is thus great reason to be optimistic that methodological 
limits will continue to be pushed in the future. 

Additional insight into human brain function will likely come from 
the study of postmortem human brains, which has long been a staple 
method for the characterization of anatomical structure and study of 
brain disorders. New techniques have enhanced the ability to visualize 
the structure of human brain tissue (Fig. 3). For example, optical coher- 
ence tomography has been used to image ex vivo human cortical tissue, 
providing high-resolution imaging of cytoarchitecture with less distortion
than standard microscopy techniques. The first whole-brain atlas
of genome-wide gene expression in postmortem human brains has
provided an important resource for understanding how gene expression
relates to brain function; for example, the maps from this project have
been used to identify expression differences across different resting-state
networks. Continued development of such resources will be essential
for progress in understanding the genetic architecture of brain function
and its relation to mental health disorders.

Figure 3 | New methods for characterizing the postmortem human brain.
a, A map of expression of the serotonin receptor 3B displayed on the
reconstructed cortical surface in one individual from the Allen Brain Atlas
Human Brain data set (generated using data from http://human.brain-map.org/).
b, Optical coherence tomography imaging of the human brain
(2.9 µm in-plane resolution). Large panel presents an average intensity projection
in depth over 300 µm; inset zooms are maximum intensity projections over 300 µm,
showing fibres in the white matter (pink inset), fibres arcing through the
subcortical junction to insert into the cortex (cyan inset), and neurons in the
cortex (bright spots in the green inset). Image courtesy of Bruce Fischl, Caroline
Magnain and David Boas, Massachusetts General Hospital.


Connectomics 

The Human Connectome Project is nearing completion, and has
already provided a rich database for the modelling of functional and 
anatomical connectivity of the human brain. However, fundamental 
challenges remain. For example, diffusion MRI provides the means to
track white matter pathways (Fig. 4) and has been used to identify white
matter connectivity disruptions associated with cognitive disorders such
as dyslexia; however, diffusion imaging has inherent biases that limit
its ability to accurately track connections across the entire brain. The
last decade has seen a proliferation of approaches to model functional
connectivity on the basis of functional MRI data, though the dust has yet
to settle regarding which methods are most effective (for example,
ref. 80). To determine this, the analysis methods must be validated,
which is challenging to do in humans but may be achieved by using direct
measurements of functional connectivity from invasive human
approaches and non-human animals to validate the neuroimaging
results. There is increasing evidence that, at least in non-human primates,
functional connectivity reflects anatomical connectivity as measured
using either diffusion MRI or anatomical tract-tracing, but it remains
an important challenge to establish the ways in which functional and
diffusion connectivity measures converge or diverge.

Figure 4 | A ‘connectogram’ for an example healthy adult female subject.
The outermost ring shows the various brain regions arranged by lobe (fr,
frontal; ins, insula; lim, limbic; tem, temporal; par, parietal; occ, occipital; nc,
non-cortical; bs, brain stem; CeB, cerebellum) and further ordered anterior
(top) to posterior (bottom). The colour map of each region is lobe-specific and
maps to the colour of each regional parcellation as determined using FreeSurfer.
The set of five rings (from the outside inward) reflect grey matter volume, area,
thickness, curvature, and connectivity density. The lines inside of the circle
represent the computed degrees of connectivity between segmented brain
regions using diffusion tractography, with colour representing the relative
fractional anisotropy of the connection (from blue to red). Image courtesy of
Jack Van Horn, University of Southern California.


Reproducibility of neuroimaging research 

Large-scale meta-analyses have made it clear that neuroimaging results 
can be highly convergent across studies, to the degree that cognitive 
processes can be accurately inferred from individual subject data using 
decoders trained on meta-analytic data based on reported activation
coordinates. However, the last few years have also seen increasing concern
regarding the reproducibility of research findings in neuroscience,
paralleling more general concerns about reproducibility of scientific
results. These issues are particularly acute for neuroimaging given the
high dimensionality of the data, relatively low statistical power of many
studies, high degree of analytic flexibility in data analysis procedures,
and potential for questionable research practices such as circular
analysis procedures. The field of neuroimaging has been at the forefront
of a number of developments that aim to improve reproducibility, and the
sharing of data is increasingly being embraced. The Alzheimer’s Disease
Neuroimaging Initiative (ADNI), International Neuroimaging Data
Sharing Initiative (INDI), ENIGMA, and the Human Connectome
Project together have shared thousands of neuroimaging data sets and
this has enabled a number of novel discoveries. For example, data sharing
by the ENIGMA consortium has enabled the first well-powered genome-wide
association study of brain volume, identifying replicated associations
between brain volume and several common genetic variants. In
addition, nearly all of the main software packages for neuroimaging data 
analysis are free and open source, providing transparency and repro- 
ducibility in data analysis across groups, and the publication of fully 
reproducible analysis workflows has begun (for example, ref. 89). The 
increasing use of machine learning methods, with their focus on out-of- 
sample generalization rather than statistical significance, is also leading to 
a greater emphasis on achieving reproducibility. 


Outlook 


The use of new tools for imaging and manipulating the brain will con- 
tinue to advance our understanding of how the human brain gives rise to 
thought and action. The combination of myriad methods with different 
and complementary strengths and weaknesses will allow neuroscientists 
to develop a multilevel understanding of the brain, spanning from mole- 
cules to large-scale networks. New analysis methods have advanced 
fMRI beyond ‘blobology’ and will provide direct insight into the map- 
ping of mental and neural representations, while newer analysis and 
acquisition methods will offer other novel insights into the relation of 
mind and brain. fMRI and other human neuroscience methods will
continue to be applied to solve real-world problems, within medicine
and beyond. Although some of these applications are currently pre- 
mature relative to the demonstrated capabilities of the methods, it is 
clear that the new methods of human neuroscience will have much to 
offer science and society. 
Nutrition and brain aging: how can we move ahead?


P Barberger-Gateau


Epidemiological studies and basic research suggest a protective effect of long-chain omega-3 polyunsaturated fatty acids,
antioxidants and B vitamins against brain aging. However, most randomized controlled trials (RCTs) with nutritional supplements
have yielded disappointing effects on cognition so far. This paper suggests some original directions for future research to better
support a role of nutrition in brain aging. The role of other nutrients such as docosapentaenoic acid and fat-soluble vitamins D and
K should be investigated. A more holistic approach to nutrition is necessary, encompassing potential synergies between nutrients
as found in a balanced diet. Potential beneficiaries of nutritional supplementation should be better targeted, according to their
dietary, cognitive and possibly genetic characteristics. Innovative RCTs should be implemented to assess the impact of nutrition on
the prevention or treatment of cognitive decline in older persons, using intermediate biomarkers of disease progression and
mechanisms of action of nutrients as outcomes.
INTRODUCTION


In accordance with the results of basic research, epidemiological
studies suggest a protective effect of several classes of nutrients
against cognitive decline and risk of dementia,1,2 including
principally long-chain omega-3 polyunsaturated fatty acids (n-3
PUFA),3,4 vitamins C and E,5 carotenoids,6 polyphenols7 and B
vitamins,8 although some discordant data exist as well. Many
underlying mechanisms can be invoked to support the biological
plausibility of a protective effect of these nutrients against brain
aging: role in brain composition and function,9 hippocampal
neurogenesis,10 limitation of accumulation of the beta-amyloid
peptide,11,12 decreased oxidative stress and inflammation,13
decreased homocysteine concentration8 and vascular effects.14

However, most well-conducted randomized controlled trials
(RCTs) with nutritional supplements have yielded disappointing
effects on cognition so far (for reviews, see Wald et al.,15
Mazereeuw et al.16 and Dangour et al.17). As written by Dangour
et al.17 regarding n-3 PUFA, but easily transferable to other
nutrients, whether discrepancies between epidemiological and
intervention studies ‘reflect a real absence of benefit on cognitive
function from (nutritional) supplementation, or whether they
reflect intrinsic limitations in the design of published studies
remains open to question’. The aim of this paper was to suggest
new areas of research to better support a role of nutrition in brain
aging. Two aspects are developed: nutritional exposure, concerning
the investigation of new nutrients and a more holistic approach to
the diet, and innovative RCTs with nutritional supplements,
regarding their inclusion criteria and outcomes in relation to
cognition.


INVESTIGATING NEW NUTRIENTS 


There is convincing evidence that fats are a major class of
nutrients for brain development, structure and functioning.9
However, previous studies have neglected some fatty acids or
fat-soluble vitamins that could have an important role in the brain.


Docosapentaenoic acid (DPA) 


The term DPA in fact refers to two different fatty acids: one in
the omega-3 series (C22:5n-3), situated between eicosapentaenoic
acid (EPA) and docosahexaenoic acid (DHA) in the desaturation
and elongation chain of alpha-linolenic acid, and another in
the omega-6 series (C22:5n-6), resulting from elongation, desaturation
and beta-oxidation of arachidonic acid. Like EPA and DHA, n-3
DPA is mainly found in fatty fish and could contribute to the
apparently protective effect of fish consumption against cognitive
decline. The metabolism and biological functions of DPA are still
poorly understood. Supplementation with pure n-3 DPA induced a
significant rise in the proportions of EPA and DHA in plasma
triacylglycerides, suggesting that DPA may act as a reservoir of the
major long-chain n-3 PUFA in humans.18 In healthy humans
consuming their usual diet, n-3 DPA is found in non-negligible
amounts in cerebrospinal fluid (CSF) and is positively correlated
with total plasma n-3 PUFA.19

N-6 DPA in erythrocytes was higher in 48 older individuals with 
mild cognitive impairment (MCI) than in 27 healthy controls in a 
cross-sectional analysis carried out at baseline of an RCT.7° Higher n-6 
DPA also correlated with poorer mental health and less 
satisfaction with life, while lower n-3 DPA was associated with a 
higher frequency of self-reported body pain. Data from this small 
study with multiple statistical testing should be reproduced in 
larger, longitudinal studies before drawing firm conclusions. In the 
Three-City cohort study, plasma n-3 DPA was not associated with 
risk of incident dementia, and n-6 DPA was not available.?' 


Vitamin D 


Beyond its role in bone health, the multiple biological functions of 
vitamin D are raising increasing interest.?* In humans, vitamin D 
comes from both synthesis during exposure to sunlight and food 
sources, mainly fatty fish as for EPA and DHA. Thus it is difficult to 
disentangle their respective effects in observational studies. The 
prevalence of vitamin D deficiency is very high in older adults.? 

Meta-analyses of case-control studies showed that patients 
with Alzheimer’s disease (AD) had lower serum vitamin D 
concentrations than controls,?* with borderline significance after 
adjusting for age.** Longitudinal studies are scarce. Participants in 
the InCHIANTI cohort study who were severely vitamin D deficient 
according to their serum 25-hydroxy-vitamin D levels (< 25 nmol/l) 
had a significantly increased risk of cognitive decline over 6 
years.”> However, this association was not reproduced in another 
longitudinal study.7° Several RCTs assessing the impact of vitamin 
D on cognition, either as a primary or a secondary outcome, are in 
progress (clinicaltrials.gov). In the Women’s Health Initiative, 
supplementation with 400 IU/day of vitamin D, along with 
calcium, did not result in decreased risk of dementia or cognitive 
decline, compared with the placebo group over a mean follow-up 
of 8 years.”’ If ongoing RCTs yield more positive results with 
higher doses of vitamin D, improvement of cognitive function 
would be a beneficial side-effect of systematic vitamin D 
supplementation in deficient individuals. 


Vitamin K 


A growing body of evidence supports a role for vitamin K in brain 
and cognition.*® Several vitamin K-dependent proteins contribute 
to brain function, notably Gas6 which is expressed in the 
hippocampus of adult rat and contributes to survival of neurons 
and microglia. Moreover, vitamin K is involved in sphingolipid 
metabolism.”® Dietary vitamin K is provided by vegetables, whose 
consumption is inversely associated with cognitive decline and 
risk of dementia.*? Data on the relationship between vitamin K 
status and cognition in humans are lacking. A single cross- 
sectional epidemiological study showed a specific relationship 
between high serum vitamin K (phylloquinone) concentrations 
and better performances in verbal episodic memory in 320 
healthy older adults.®° These data need to be confirmed in 
longitudinal studies using serum phylloquinone as a biomarker of 
vitamin K status. 


FROM A SINGLE-NUTRIENT APPROACH TO DIETARY PATTERNS 


Most previous epidemiological studies and RCTs have focused on 
single nutrients, ignoring their potential synergistic effects when 
they are provided in optimal quantities and proportions as in a 
balanced diet. Considering dietary patterns may lead to a more 
holistic approach to the diet.*' Diets rich in fruits, vegetables, 
vegetable oils, legumes, cereals and fish provide folate, vitamins C 
and E, carotenoids, polyphenols and long-chain n-3 PUFA along 
with a low glycemic index that contribute to lower inflammation, 
oxidative stress and homocysteine concentration, improve 
vascular status and insulin sensitivity and maintain brain structure 
and functioning.” Moreover, some combinations of nutrients may 
have synergistic effects that will reinforce their individual proper- 
ties. For example, antioxidant nutrients can protect long-chain n-3 
PUFA from peroxidation to which they are particularly susceptible 
because of their multiple double bonds. 


Folate and n-3 PUFA 


In the Three-City cohort study, we observed that regular fish 
consumption was not protective against risk of dementia if not 
accompanied by regular intake of fruits and vegetables.** 
Accordingly, a placebo-controlled RCT showed that response to 
n-3 PUFA supplementation (2g EPA and 1g DHA per day) for 
6 weeks was enhanced in individuals who reported being high 
consumers of dark-green vegetables.** These vegetables are good 
dietary sources of folate and lutein, a carotenoid, suggesting a 
potential interaction between these nutrients and the absorption 
or bioavailability of n-3 PUFA. However, another RCT, an ancillary 
study of the SU.FOL.OM3 trial providing B vitamins (including 
560 µg/d folate), EPA (400 mg/d) and DHA (200 mg/d) to 
individuals with a history of cardiovascular disease, did not 
show any improvement of cognitive function after 4 years, 
except maybe in some specific subgroups.*” 


Healthy diets 


In observational studies, several models of healthy diets combin- 
ing multiple classes of nutrients have been found to be associated 
with lower cognitive decline and risk of dementia, the most widely 
studied being the Mediterranean diet.2?' A recent meta-analysis 
evidenced that higher adherence to a Mediterranean-type diet is 
associated with a reduced risk of developing MCI and AD.%° 
However, the construct of ‘Mediterranean diet’ may actually 
encompass very different food and nutrient intakes, depending on 
the population and geographical area, as most scoring instru- 
ments developed to assess adherence to this dietary model are 
based on sample-specific medians. 
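
The dependence on sample-specific medians can be illustrated with a minimal sketch. It assumes a deliberately simplified score (one point for intake at or above the sample median of a beneficial food, one point for intake below the sample median of a detrimental food); the food groups, intake values and function name are hypothetical and are not taken from any published instrument.

```python
from statistics import median

# Hypothetical, simplified component lists (real instruments use more items).
BENEFICIAL = ("vegetables", "fish")
DETRIMENTAL = ("meat",)


def adherence_score(person, sample):
    """Score one person against the medians of the sample they belong to."""
    score = 0
    for food in BENEFICIAL + DETRIMENTAL:
        cutoff = median(p[food] for p in sample)
        if food in BENEFICIAL:
            score += int(person[food] >= cutoff)  # point for high intake
        else:
            score += int(person[food] < cutoff)   # point for low intake
    return score


# The same (invented) daily intakes in g/day, placed in two invented samples.
person = {"vegetables": 250, "fish": 30, "meat": 90}
sample_a = [person,
            {"vegetables": 150, "fish": 10, "meat": 120},
            {"vegetables": 200, "fish": 20, "meat": 150}]
sample_b = [person,
            {"vegetables": 400, "fish": 60, "meat": 50},
            {"vegetables": 350, "fish": 45, "meat": 70}]

score_a = adherence_score(person, sample_a)
score_b = adherence_score(person, sample_b)
print(score_a, score_b)
```

With identical intakes, the individual reaches the maximum score in the first sample and the minimum in the second, which is why scores built on sample medians are hard to compare across populations.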

Very few RCTs have assessed the impact of shifting to a global 
healthier diet on cognition. In the PREDIMED-NAVARRA study, 
participants at high vascular risk randomized to receive intensive 
advice to adopt a Mediterranean diet enriched with virgin olive oil 
or nuts scored significantly better than the low-fat control group 
on the Mini-Mental State Examination (MMSE), a global test of 
cognitive performance, and the Clock Drawing Test, assessing a 
wide range of high-level cognitive abilities including executive 
functions, after 6.5 years of follow-up.2” However, cognition was not 
measured at baseline, and the impact of the diets on cognitive 
decline could thus not be assessed. The Exercise and Nutritional 
Interventions for Cardiovascular Health RCT was designed to 
examine the effects of the Dietary Approach to Stop Hypertension, 
alone or in combination with weight management through 
physical exercise and calorie restriction, on blood pressure in 
overweight adults with high blood pressure. The Dietary Approach 
to Stop Hypertension diet was associated with a significantly 
improved psychomotor speed relative to controls.2% 


Supplements with multiple nutrients 


RCTs with supplements containing various combinations of 
nutrients trying to reproduce healthy diets can be used to provide 
a proof of concept of their global impact on cognition. 
A systematic review of 10 RCTs with multivitamins provided weak 
evidence of improvement of some cognitive abilities, especially 
immediate free recall.2? However, the few available studies were 
very heterogeneous regarding inclusion criteria, multivitamin 
constituents and cognitive outcomes. More convincingly, the 
Souvenir II placebo-controlled RCT showed that a 24-week 
supplementation with Souvenaid was able to improve memory 
in drug-naive patients with mild AD, that is, an MMSE score > 20.*° 
Souvenaid provides EPA, DHA, B vitamins, antioxidants (vitamins C 
and E, selenium), choline and uridine monophosphate, a specific 
combination intended to improve synaptic dysfunction in AD. 
These trials need to be replicated in populations with various 
degrees of cognitive impairment to identify the most responsive 
individuals in whom a supplementation could be recommended. 


BETTER TARGETING POTENTIAL BENEFICIARIES OF 
NUTRITIONAL INTERVENTIONS 


RCTs with nutritional supplements have been carried out for the 
primary or secondary prevention of dementia or for treatment of 
patients with mild-to-moderate AD (MMSE range 10-26). Some 
may have failed to show any significant impact on cognition, 
either because they have included healthy participants who did 
not decline during the few months of the trial*' or, conversely, 
individuals with AD whose disease was probably too severe to 
expect a significant effect from nutrition.*?“* Few studies have 
specifically targeted the MCI stage.*°“° Identifying individuals 
who are the most likely to benefit from nutritional 
supplementation is therefore of utmost importance.*” 


Cognitive criteria and biomarkers of disease progression 

In its strict definition, primary prevention takes place before the 
development of the disease, that is, before any sign of 
neurodegeneration in the case of AD. Healthy diets in early life 
could contribute to the development of brain reserve.*® However, the impact 
of interventions targeting this population on late life cognitive 
performance is obviously impossible to demonstrate. Thus primary 
and secondary prevention of dementia cannot be disentangled. 
However, cognitive decline is a relatively late event in the course 
of the disease, and it has been estimated that beta-amyloid, the 
hallmark of AD, accumulates sharply for about 15 years in the 
brain before reaching a plateau when the first clinical signs 
emerge.’? This 15-year interval would represent a large window 
for secondary prevention, especially with nutritional interventions. 
In the absence of specific impairment on most neuropsychological 
tests at this stage, biomarkers of disease progression could help to 
identify the best window of opportunity for RCTs with nutritional 
supplements. These biomarkers include brain atrophy as mea- 
sured by magnetic resonance imaging, impaired brain glucose 
metabolism and imaging of beta-amyloid accumulation with 
positron emission tomography and measurements of beta- 
amyloid and phosphorylated tau in CSF.°° However, in our present 
state of knowledge these should be considered only as research 
criteria and not used for screening individuals at high risk of 
dementia. 


Nutritional criteria 


Unlike in RCTs with drugs, participants in nutritional interventions 
do not have a null basal level of the nutrient to be evaluated 
before any supplementation. There is a large day-to-day (intra- 
individual) variability, especially for nutrients brought by foods 
that are not consumed on a daily basis such as fish, the main 
dietary provider of EPA and DHA. In addition, there is a large inter- 
individual variability, depending on dietary habits. Both sources of 
variability decrease the power of the study by inflating confidence 
intervals and may hamper its external validity. Providing 
individuals who already meet their dietary requirements with 
additional quantities of nutrients is probably useless and may 
even be harmful when upper tolerable intake levels are 
surpassed.” 
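
The effect of added variability on confidence intervals and power can be sketched numerically. The example below is only an illustration under stated assumptions: a two-arm trial with equal group sizes, a continuous outcome analysed with a normal approximation, and invented values for the treatment effect, standard deviations and sample size. None of these numbers comes from the trials discussed here.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def ci_half_width(sd, n_per_group, level=0.95):
    """Half-width of the CI for a difference between two group means."""
    z = STD_NORMAL.inv_cdf(0.5 + level / 2)
    return z * sd * sqrt(2 / n_per_group)


def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test to detect a true difference delta."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    se = sd * sqrt(2 / n_per_group)
    return STD_NORMAL.cdf(delta / se - z_crit)


# Hypothetical trial: 100 participants per arm, true effect of 0.4 units.
# sd=1.0 stands for a population with homogeneous nutrient status,
# sd=1.5 for one where intra- and inter-individual variability add noise.
homogeneous = approx_power(delta=0.4, sd=1.0, n_per_group=100)
heterogeneous = approx_power(delta=0.4, sd=1.5, n_per_group=100)
print(round(homogeneous, 2), round(heterogeneous, 2))
print(ci_half_width(1.0, 100), ci_half_width(1.5, 100))
```

Under these assumptions the interval widens in proportion to the standard deviation and the power drops substantially for the same sample size, which is the argument for restricting inclusion to participants with comparable nutritional status.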

Inclusion criteria should consider nutritional criteria such as low 
intake of EPA or DHA®*°* or low blood levels of long-chain n-3 
PUFA.* Global malnutrition assessed on generic tools such as 
the Mini-Nutritional Assessment could also help target 
older individuals who could benefit from nutritional 
supplementation.°> 


Genetic characteristics 


Gene × diet interactions on cognition are still poorly understood. 
The epsilon 4 allele of the Apolipoprotein E (APOE4) gene, the 
main genetic risk factor for AD, is associated with poorer blood 
response to supplementation with n-3 PUFA®® or fish intake.°” 
Moreover, some epidemiological studies and RCTs have shown 
that APOE4 carriers derived no benefit from n-3 PUFA supplementation 
or fish consumption on cognition.°*® Other genetic risk factors of 
AD involved in lipid metabolism could also interact with dietary 
n-3 PUFA.°° Further research is needed to better identify these 
interactions before considering genetic characteristics as inclusion 
criteria in RCTs with nutritional supplements. In the present state 
of knowledge, population screening of individuals based on their 
genotype would be unethical, but it will probably emerge in the 
future with the perspective of a ‘personalized nutrition’. 


DESIGNING INNOVATIVE INTERVENTION STUDIES 


The inability of most RCTs with nutritional supplements to show any 
effect on cognitive performance may in part lie in the lack of 
sensitivity of the selected outcomes to change over a relatively 
short period. Besides traditional criteria based on cognitive and 
functional decline, biomarkers could help provide 
evidence of an impact of nutrients and shed new light on their 
mechanisms of action. 


Biomarkers of disease progression 


Few RCTs with nutritional supplements have used brain or CSF 
biomarkers of neurodegeneration as outcomes. These biomarkers 
could be more sensitive and capture early changes in the course 
of the disease that do not translate yet into improved cognitive 
performances. In the VITACOG RCT, the rate of brain atrophy over 
24 months measured by magnetic resonance imaging as primary 
outcome was significantly reduced by 30% by supplementation 
with B vitamins compared with placebo in 168 elderly subjects 
with MCI.*° Conversely, there was no effect of DHA supplementa- 
tion on brain volume change over 18 months in the subsample of 
102 participants with AD who had repeated magnetic resonance 
imaging examination in the Alzheimer's Disease Cooperative 
Study RCT.°? However, that subsample size was probably too small 
to achieve sufficient power. 

Forty-nine older adults (20 healthy and 29 with amnestic MCI 
(aMCI)) were randomized to follow a HIGH or LOW diet for 4 
weeks.*° The HIGH diet was high in fat, especially saturated fat, 
and with a high glycemic index. Conversely, the LOW diet was low 
in fat, especially saturated fat, and had a low glycemic index. In 
aMCI patients, the LOW diet increased beta42-amyloid concentra- 
tions in CSF but had the opposite effect in healthy adults, 
suggesting different mechanisms of action on the accumulation or 
clearance of beta-amyloid depending on the stage of the disease. 
The apolipoprotein E concentration in CSF was increased by the 
LOW diet and decreased by the HIGH diet. These changes in 
disease biomarkers were accompanied by improvement in 
delayed visual recall assessed by the Brief Visuospatial Memory 
Test with the LOW diet. 


Biomarkers of mechanism of action 


Many biomarkers of putative mechanisms of action of nutrients 
can also be proposed, preferably as secondary outcomes, in RCTs. 
These include principally measures of inflammation,©° oxidative 
stress assessed by isoprostanes in CSF,“*“° homocysteine®’ and 
markers of insulin resistance.*© 

In the LOW/HIGH RCT cited above,*® the CSF insulin concentration 
increased with the LOW diet in aMCI patients, whereas the 
HIGH diet lowered the CSF insulin concentration for healthy 
adults. As an interpretation of these findings, the authors suggest 
that restoration of normal insulin concentration and activity in 
central nervous system may have beneficial effects, such as 
protection against synaptotoxicity of beta-amyloid oligomers. The 
peripheral metabolic profile of participants was also changed as 
the HIGH diet increased and the LOW diet decreased plasma lipids 
and insulin concentration in both groups, indicating improved 
insulin sensitivity with the LOW diet. The LOW diet also had a 
favourable impact on oxidative stress with reduced F2-isoprostane 
concentrations in CSF.*° 

However, some inconsistent results may arise from cognitive 
and biomarker outcomes. For instance, most RCTs with B vitamins 
have shown a non-significant impact on cognition’? despite a 
strong homocysteine-lowering effect. Paradoxically, supplementa- 
tion with a cocktail of antioxidant nutrients was able to decrease 
oxidative stress as shown by CSF isoprostanes but increased 
cognitive decline.** 

The considerable development of omics technologies will give 
new insight into the potential metabolic pathways involved in the 
link between nutrition and brain functioning and provide new 
integrative biomarkers that will help understand their effects. For 
example, lipidomic profiling recently identified a set of 10 lipids in 
plasma that predicted short-term conversion to aMCI or AD with a 
very high accuracy in healthy elderly persons.®? These phospho- 
lipids have essential structural and functional roles in cell 
membranes, suggesting that their peripheral blood levels could 
be an early correlate of neurodegeneration in AD. Moreover, this 
specific lipid profile could be used as inclusion criteria in more 
efficient RCTs of nutritional interventions with lipids for the 
prevention of cognitive decline. 


CONCLUSION 


Despite the disappointing results of nutritional interventions on 
cognition so far, there is considerable room for improvement and 
more evidence-based knowledge on the link between nutrition 
and cognitive decline in older persons. Trials with nutritional 
supplements should learn from RCTs with drugs regarding 
biomarkers of disease progression for inclusion criteria and 
outcomes. Progress in genetics and omics technologies will also 
offer new opportunities for research in this field. 
Ageing neurons need REST 


Individuals who develop Alzheimer’s 
disease typically exhibit neuronal 
loss in the hippocampus and cortex. 
In the healthy ageing brain, both 
of these neuronal populations are 
preserved, but little is known about 
the mechanisms that protect neurons 
against stress and toxic insults. 
Yankner and colleagues now show 
that the transcriptional repressor 
REST (RE1-silencing transcription 
factor) has a crucial role in 
neuroprotection during ageing. 

The authors carried out a 
bioinformatic analysis of previous 
transcriptional profiling studies of 
the ageing brain, which suggested 
that REST activation may underlie 
changes in gene expression during 
ageing. To explore this possibility, 
they examined post-mortem samples 
of the prefrontal cortex (PFC) from 
young adult and aged individuals. 
They found that ageing is associated 
with a significant induction of REST 
in neuronal nuclei together with 
increased REST binding to target 
genes. By contrast, nuclear REST 
expression was substantially reduced 
in individuals with mild cognitive 
impairment (MCI) and almost absent 
in those with Alzheimer’s disease. 
Furthermore, REST levels closely 
correlated with cognitive function 
scores derived from longitudinal 
neuropsychometric testing. 

These findings suggest that 
deregulation of genes targeted by 
REST might occur in Alzheimer’s 
disease. To determine which genes 
are involved, the authors carried out 
chromatin immunoprecipitation and 
deep sequencing on a neural cell line 
and confirmed the results in neuronal 
nuclei isolated from the human PFC. 


This showed that REST regulates many genes 
associated with cell death pathways and 
Alzheimer’s disease pathogenesis. 
Analysis of human brain samples 
demonstrated reduced REST 
binding and increased expression of 
many of these genes in individuals 
with Alzheimer’s disease. 

The repression of cell death- 
associated genes suggests that REST 
might be neuroprotective. Indeed, 
the authors found that neuronal 
cultures derived from conditional 
knockout mice lacking REST were 
more vulnerable than control 
cultures to degeneration and cell 
death induced by oxidative stress or 
incubation with toxic oligomers of 
amyloid-β (Aβ). Furthermore, mice 
lacking REST exhibited progres- 
sive neurodegeneration, including 
neuronal loss in the hippocampus 
and cortex. 

What are the mechanisms by 
which REST is boosted in the 
healthy ageing brain? The authors 
demonstrated the induction of 
REST in primary cultures of cortical 
neurons exposed to various stressors 
associated with ageing. REST was also 
induced in a neural cell line exposed 
to the medium in which the ‘stressed’ 
neurons had been cultured or to 
extracts of aged human brain, sug- 
gesting that a cell-non-autonomous 
pathway is involved. Further experiments 
showed that enhancing signalling 
through the WNT-β-catenin 
pathway induced REST and that 
β-catenin levels were increased in 
the aged PFC and colocalized in the 
nucleus with REST, suggesting that 
this pathway contributes to the 
induction of REST in the ageing brain. 

These findings suggest that loss 
of neuroprotective nuclear REST 
contributes to neuronal cell death in 
Alzheimer’s disease, prompting the 
authors to consider the mechanisms 
underlying this loss. Dysregulation 
of autophagy, a process that can 
sequester proteins in the cytoplasm, 
occurs in several neurodegenerative 
diseases, and the authors showed 
that REST is present in autophagosomes 
together with misfolded 
proteins such as Aβ, tau and 
α-synuclein in several neurodegenerative 
disorders. Moreover, 
activating autophagy in cultured 
neural cells reduced nuclear 
REST levels and resulted in REST 
appearing in autophagosomes. 

This study shows that REST induc- 
tion in the ageing brain is a crucial 
neuroprotective factor and suggests 
that boosting this response might pro- 
vide a strategy to combat age-related 
neurodegenerative disease, especially 
Alzheimer’s disease. 

Katherine Whalley 


ORIGINAL RESEARCH PAPER Lu, T. et al. REST 
and stress resistance in ageing and Alzheimer’s 
disease. Nature http://dx.doi.org/10.1038/nature13163 (2014) 
COMMENTARY 


FOCUS ON STRESS 


Stress and the brain: individual variability 
and the inverted-U 


Robert M Sapolsky 


It is a truism that the brain influences the body and that peripheral physiology influences the brain. Never is this 
clearer than during stress, where the subtlest emotions or the most abstract thoughts can initiate stress responses, 
with consequences throughout the body, and the endocrine transducers of stress alter cognition, affect and behavior. 
For a fervent materialist, few things in life bring more pleasure than contemplating the neurobiology of stress. 


The early years of the field of stress biology were 
dominated by the first half of the neuroendo- 
crine loop, namely the ability of the brain to 
initiate the body’s stress response. Stress physi- 
ology was born early in the last century, in the 
Paleolithic era of Walter Cannon implicating 
the sympathetic nervous system in the ‘fight or 
flight’ response, and in Hans Selye identifying 
glucocorticoids as the other main mediator of 
the stress response. These foundational find- 
ings took the field in many directions. One 
was the revolutionary work of Geoffrey Harris, 
Roger Guillemin and Andrew Schally showing 
that the brain is an endocrine gland, secreting 
releasing and inhibiting hormones into the 
hypothalamic-pituitary portal system; in many 
ways, the decades-long arc of that revolution was 
bracketed by stress research, in that corticotro- 
pin-releasing hormone was the first of the prin- 
cipal hypothalamic hormones whose existence 
was inferred physiologically and the last to be 
isolated and biochemically characterized. 

The half of the loop concerning the brain 
regulating the body also encompasses the 
historic welcoming of psychologists into the 
field; this came with the demonstration that 
the stress response, conceptualized in the con- 
text of acute physical crisis, can be robustly 
activated by purely psychological states, such 
as loss of control, predictability and social sup- 
port. And that half of the loop also incorpo- 
rates the fact, first deeply appreciated by Selye, 
that makes stress biology a branch of medicine: 
prolonged stress increases the odds of being 
sick. This has facilitated the birth of other sub- 
fields (for example, psychoneuroimmunology), 
and is now an area of tremendous amounts of 
reductive research. As a result, we have a fairly 
good idea as to how, say, a fleeting, stressful 
thought changes transcriptional events relevant 
to oxidative metabolism in your big toe. 

In many ways, the pendulum has swung and, 
in recent decades, the field has come to be dom- 
inated by the second half of the loop, namely the 
effects of the peripheral stress response on the 
brain. This is certainly reflected in the collec- 
tion of papers in the present issue. Collectively, 
they highlight a number of important themes. 


Reductive mechanisms underlying stress 
effects in the brain 

A major one, naturally, has been the ever- 
increasing insights into the mechanisms by 
which stress affects the brain. For example, 
decades of work have fruitfully explored the 
disruptive effects of stress on hippocampal- 
dependent declarative memory processes 
and on frontocortical-dependent executive 
function and behavioral regulation. It is only 
in more recent years that we have gained 
insight into the signal transduction path- 
ways and transcriptional events that medi- 
ate these stress effects). Moreover, some of 
these mechanistic insights contain surprises 
that have upended dogma. One, for example, 
concerns neuroinflammation and the text- 
book knowledge that glucocorticoids are 
uniformly anti-inflammatory. However, it 
is now recognized that this is not always the 
case, and that glucocorticoids and stress can 
even worsen facets of neuroinflammation in 
a brain region-specific manner’; the novel 
depressogenic effects of neuroinflammation* 
provide a mechanistic route by which stress 
predisposes to depression. 


Early life stress and adult opportunities 
that should not be lost 

Another theme concerns the complex and 
supremely important intersection of brain 
development and adult neuroplasticity. A 
canonical body of knowledge shows how stress 
in early life, particularly in the perinatal period, 
can have predominately adverse neurobiologi- 
cal consequences stretching long into adult- 
hood?. These effects can have an extraordinarily 
long reach, changing the trajectory of brain 
aging, and even having multi-generational 
effects, through the non-genetic transmission of 
behavioral and physiological traits. The mediat- 
ing mechanism for these long-term effects has 
increasingly been shown to be epigenetic, a cur- 
rent focus of intense amounts of work. 

Running in parallel with this is the evidence 
of plasticity in the adult nervous system. The 
excitability of synapses changes, dendritic 
spines come and go within minutes, dendritic 
processes expand or retract, and circuitry 
remaps. And then there is, of course, the revo- 
lution of adult neurogenesis. Little in the brain, 
it turns out, is set in stone. 

When combined, these two sets of findings 
produce conclusions that are both salutary 
and alarming and should galvanize action. 
First, early life adversity can leave broad and 
permeating scars of neurobiological dysfunc- 
tion long into the future, even unto the pro- 
verbial generations. Second, there is far more 
potential for lessening, halting or even revers- 
ing these consequences of early life stress in 
the adult than anyone could have imagined. 
Third, the longer the intervention is delayed, 
the more of an uphill battle there will be to 
make things better. 


Multi-level meanings of stress effects 
Another theme is the disconnect between the 
‘meaning’ of a stress effect on the neuron, the 
brain structure and the organism. A prime 
example of this concerns the hippocampus, 
frontal cortex and amygdala®. Extensive work 
has shown that, in the first two structures, stress 
and glucocorticoid excess will decrease synap- 
tic plasticity, cause atrophy of dendritic pro- 
cesses and even cause a loss of total volume or 
gray matter volume. These findings have been 
appropriately interpreted as bad things for the 
organism, and underlie some of the adverse neu- 
rocognitive consequences of depression, post- 
traumatic stress disorder and childhood poverty. 
A different picture occurs in the amygdala, par- 
ticularly the basolateral amygdala, where stress 
and glucocorticoids increase synaptic plastic- 
ity and foster expansion of dendritic processes 
(which raises challenging questions regarding 
the mechanisms by which glucocorticoid sig- 
naling has diametrically opposite effects in 
these regions). At first glance, this can readily 
be interpreted as a good thing—after all, who 
wouldn't want their neurons to be more strongly 
and broadly integrated into circuits? However, 
this amygdaloid hypertrophy can be anything 
but beneficial, as it contributes to the adverse 
effects of stress on fear acquisition and extinc- 
tion: when we are stressed, we learn more readily 
to be afraid when there is no need to and less 
readily detect when we are safe. The road to a 
crippling anxiety disorder is paved with perky 
amygdaloid synapses. 


The ubiquitous, but nonspecific, role of 
stress in psychiatric disorders 

These amygdaloid effects serve as a segue to the 
ever stronger evidence for stress as a risk fac- 
tor for an array of neuropsychiatric disorders, 
including depression, anxiety, schizophrenia and 
various addictive behaviors”*. These linkages 
both take the form of early life stress predisposing 
toward adult illness and periods of acute stress 
during adulthood triggering episodes of disease. 
The breadth of these stress effects implies, at the 
same time, their nonspecificity. A good example 
of this is seen with the immunophilin FKBP5, 
which, as a glucocorticoid receptor co-factor, is 
highly pertinent to stress neuroendocrinology; 
variants of the FKBP5 gene are associated with 
altered risks of depression, anxiety and PTSD®. 
As another example, consider the DISC-1 
gene (Disrupted in Schizophrenia-1), whose 
cytoskeletal protein product interacts exten- 
sively with the signal transduction pathways of 
stress signals; despite the specificity implied by 
DISC-1’s name, abnormalities in the structure 
or regulation of the protein have been implicated 
not just in schizophrenia, but in depression and 
bipolar disorder as well’. 

Figure 1 Conceptualization of the inverted-U in the context of the benefits and costs of stress. A broad 
array of neurobiological endpoints show the same property, which is that stress in the mild-to-moderate 
range (roughly corresponding to 10-20 µg dl⁻¹ of corticosterone, the species-specific glucocorticoid 
of rats and mice) has beneficial, salutary effects; subjectively, when exposure is transient, we typically 
experience this range as being stimulatory. In contrast, both the complete absence of stress, or stress that 
is more severe and/or prolonged than that in the stimulatory range, have deleterious effects on those same 
neurobiological endpoints. The absence of stress is subjectively experienced as understimulatory by most, 
whereas the excess is typically experienced as overstimulatory, which segues into ‘stressful’. Many of the 
inverted-U effects of stress in the brain are explained by the dual receptor system for glucocorticoids, 
where salutary effects are heavily mediated by increasing occupancy of the high-affinity, low-capacity MRs 
and deleterious effects are mediated by the low-affinity, high-capacity GRs. 

Appreciating the importance of stress is 
critical for understanding the neurogenetics of 
psychiatric illness, as stress is the poster child 
for the environment part of gene × environment
interactions. Beginning with the landmark
demonstration of the role of serotonin
transporter gene polymorphisms in depression
risk¹⁰, the genetics of mental illness has
been repeatedly shown to be about stress/diathesis
and about vulnerability to stress.

Thus, a variety of themes appear in these 
papers. Two even larger ones at least tacitly 
run through all of them and, I hope, will shape 
research in stress neurobiology for years to come. 


The humdrum and the fascinating versions 
of the obligatory question of what is stress 

Seemingly within moments of Selye popularizing
the word stress in the world of biomedicine,
the definitional debate began. Is stress more
about the unpleasantry in the outside world 
(that is, the stressor) or the resulting changes 
in the body (that is, the stress response)? Or is 
it mostly about the neurobiological and psy- 
chological space floating between the two? 
This eventually wearisome debate inevitably 
constituted the first session of virtually every 
stress conference for decades; it has finally lost 
steam, with a sense that the word encompasses 
all of the above—let a thousand flowers bloom, 
but just remember to define your particular 
flower in the Methods section. 

The far more interesting version of this question
addresses the fact that in any species you’d
care to study, different individuals respond to 
stress differently; there are typically dramatic 
individual differences as to whether a particu- 
lar event or internal state is even perceived to 
be stressful. In other words, what is stress...for 
this individual? Of course, individual variability 
is not always the case; a severe injury, a major 
burn or a sprint from a predator will reliably
activate the stress response and evoke an aver- 
sive subjective sense in virtually any organism. 
But these are not the circumstances of stress that 
are most pertinent to understanding health and 
disease in contemporary life. Instead, individual 
differences are most notable as we navigate life’s 
social exigencies. 

Individual differences in stress biology were 
once mostly an experimental irritant: oh no, 
because of variability we need a bigger sam- 
ple size. However, individual variability as to 
whether something is perceived as stressful, and 
in resilience and vulnerability to stress-related 
disease, should be viewed as the most important 
topic in the field. After all, one person's stress 
envelope is pushed by getting up at the crack of 
dawn to bird-watch, and another's by a stint as 
a mercenary in Somalia. To best appreciate the 
importance of individual differences in stress 
responsiveness, it is worth focusing on the single 
most important concept in the field. 


The inverted-U 

When viewed from a distance, the effects of 
stress on the brain and behavior are often worse 
than murky, where a stressor might increase 
some endpoint in one setting, have no effect in 
another and decrease it in a third. To anyone 
working in the field, what was apparent from the 
start was that the response to stress depends on 
the nature, intensity and duration of a stressor 
(which at least partially translates into a depen- 
dence on the pattern of activation of the sympa- 
thetic nervous system, the adrenocortical axis 
and the other mediators of the stress response). 

This represents progress and allows one to 
begin to corral that array of heterogeneous 
and often diametrically opposite findings into 
some groupings: for example, the contrasting 
responses to physical versus psychological 
stressors, to biotic versus abiotic stressors, to 
continuous versus intermittent stressors, and 
so on. Enormous unifying clarity came with 
the recognition that, to a large extent, the 
effects of stress in the brain form a nonlinear 
‘inverted-U’ dose-response curve as a function 
of stressor severity: the transition from the 
complete absence of stress to mild stress causes 
an increase in endpoint X, the transition from 
mild-to-moderate stress causes endpoint X to 
plateau and the transition from moderate to 
more severe stress decreases endpoint X. 

A classic example of the inverted-U is seen
with the endpoint of synaptic plasticity in the
hippocampus, where mild-to-moderate stressors,
or exposure to glucocorticoid concentrations
in the range evoked by such stressors,
enhance primed burst potentiation, whereas
more severe stressors or equivalent elevations
of glucocorticoid concentrations do the
opposite¹¹. This example also demonstrates
an elegant mechanism for generating such an
inverted-U¹². Specifically, the hippocampus
contains ample quantities of receptors for glu- 
cocorticoids. These come in two classes. First, 
there are the high-affinity, low-capacity mineralocorticoid
receptors (MRs), which are mostly
occupied under basal, non-stress conditions 
and in which occupancy increases to saturat- 
ing levels with mild-to-moderate stressors. 
In contrast, there are the low-affinity, high- 
capacity glucocorticoid receptors (GRs), which 
are not substantially occupied until there is 
major stress-induced glucocorticoid secretion. 
Critically, it is increased MR occupancy that 
enhances synaptic plasticity, whereas increased 
occupancy of GRs impairs it; the inverted-U 
pattern emerges from these opposing effects. 
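
The way two opposing receptor populations can yield an inverted-U can be sketched numerically. The following is only a toy illustration, not a model taken from the cited work: it assumes simple one-site binding, and the dissociation constants (`kd_mr`, `kd_gr`), the weighting of the GR effect (`gr_weight`) and the concentration units are all hypothetical values chosen to make the shape visible.

```python
def occupancy(conc, kd):
    """Fractional receptor occupancy for simple one-site binding (an assumption)."""
    return conc / (conc + kd)


def net_effect(conc, kd_mr=0.5, kd_gr=5.0, gr_weight=1.5):
    """MR-driven enhancement minus weighted GR-driven impairment (arbitrary units).

    kd_mr < kd_gr encodes 'high-affinity MR, low-affinity GR'; all three
    parameter values are illustrative, not measured.
    """
    return occupancy(conc, kd_mr) - gr_weight * occupancy(conc, kd_gr)


def peak_concentration(max_conc=50.0, step=0.01):
    """Grid-search the concentration at which the net effect is largest."""
    grid = [i * step for i in range(int(max_conc / step) + 1)]
    return max(grid, key=net_effect)


if __name__ == "__main__":
    # Net effect rises while MRs fill, then falls as GRs become occupied.
    for conc in (0.0, 0.5, 1.0, 2.0, 5.0, 20.0, 50.0):
        print(f"{conc:5.1f}  MR={occupancy(conc, 0.5):.2f}  "
              f"GR={occupancy(conc, 5.0):.2f}  net={net_effect(conc):+.3f}")
    print("peak near", round(peak_concentration(), 2))
```

In this sketch the net effect climbs while the high-affinity receptor saturates and then declines below its starting value as the low-affinity receptor fills, which is the qualitative pattern described above; nothing about the exact numbers should be read as physiological.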

There are abundant additional examples of
inverted-U effects of stress in the hippocampus;
this is also the case elsewhere in the brain
(mediated by mechanisms other than the MR/GR
duo, which is mostly specific to the
hippocampus)¹³⁻¹⁸.

Thus, a morass of conflicting data is eliminated
by recognizing the prevalence of inverted-Us.
But even greater insight is provided when
considering the collective nature of the various
inverted-Us; in general, the effects of mild-to-moderate
stress (that is, the left side of the U)
are salutary, whereas those of severe stress are 
the opposite. In other words, it is not the case 
that stress is bad for you. It is major stress that is 
bad for you, whereas mild stress is anything but; 
when it is the optimal amount of stress, we love it. 

What constitutes optimal good stress? It 
occurs in a setting that feels safe; we voluntarily 
ride a roller coaster knowing that we are risk- 
ing feeling a bit queasy, but not risking being 
decapitated. Moreover, good stress is transient; 
it is not by chance that a roller coaster ride is 
not 3 days long. And what is mild, transient 
stress in a benevolent setting? For this we have 
a variety of terms: arousal, alertness, engage- 
ment, play and stimulation (Fig. 1). 

Entire careers are spent exploring differ- 
ent parts of the inverted-U. At the far left is 
the realm of an under-stimulatory environ- 
ment, with profoundly adverse effects seen 
in impoverished environments ranging from 
childhood (with, for example, the nightmar- 
ish Romanian orphanages as an extreme) to 
old age, from humans to zoo animals in sterile 
cages. The upswing of the inverted-U is the 
domain of any good educator who intuits the 
ideal space between a student being bored and 
being overwhelmed, where challenge is ener- 
gized by a well-calibrated motivating sense of 
‘maybe’; after all, it is in the realm of plausible, 
but not guaranteed, reward that anticipatory 
bursts of mesolimbic dopamine release are the
greatest¹⁹. And the downswing of the inverted-U
is, of course, the universe of “stress is bad for
you”. Thus, the ultimate goal of those studying
stress is not to ‘cure’ us of it, but to optimize it.


Individual differences meet the inverted-U

It is useful to superimpose the world of individual
differences in stress responsiveness and
vulnerability onto the inverted-U concept, as it
allows one to frame the differences in terms of
the width, height or symmetry of the U-shaped
curve. Most of all, it allows one to home in on a
critical question. For any particular stressor, set- 
ting and context, where along the axis of stressor 
severity is the peak of an individual's inverted- 
U? In other words, what is the point of transi- 
tion at which someone’s experience turns from 
stimulation to stress, from striving to learned 
helplessness, from growing from challenge to 
crumbling? And from this come additional 
questions. What are the mechanisms by which 
adversity shifts inverted-Us to the left (that is,
an increased vulnerability to subsequent stress)?
What conditions foster inverted-Us that are
shifted to the right (that is, resilience)? 
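
The idea of framing individuals by the position and shape of their curve can be made concrete with a small sketch. This is purely a hypothetical parameterization: the parabolic form, the class name `InvertedU`, its fields `peak`, `width` and `height`, and every numeric value are assumptions for illustration, not quantities from any study.

```python
from dataclasses import dataclass


@dataclass
class InvertedU:
    """Hypothetical parabolic stand-in for one individual's inverted-U."""
    peak: float    # stressor severity at which the endpoint is maximal
    width: float   # distance from the peak at which the effect falls to zero
    height: float  # effect at the peak

    def effect(self, severity: float) -> float:
        # Downward parabola: maximal at `peak`, zero at peak +/- width,
        # negative (deleterious) beyond that on either side.
        return self.height * (1.0 - ((severity - self.peak) / self.width) ** 2)

    def side(self, severity: float) -> str:
        # At or left of the peak, more stress still helps; right of it, it hurts.
        return "upswing" if severity <= self.peak else "downswing"


# Two illustrative individuals who differ only in where the peak sits.
vulnerable = InvertedU(peak=2.0, width=2.0, height=1.0)  # curve shifted left
resilient = InvertedU(peak=5.0, width=2.0, height=1.0)   # curve shifted right

if __name__ == "__main__":
    stressor = 4.5  # the same stressor severity for both, arbitrary units
    for name, person in (("vulnerable", vulnerable), ("resilient", resilient)):
        print(name, person.side(stressor), round(person.effect(stressor), 3))
```

Under these assumed parameters the same stressor lands on the downswing, with a negative effect, for the left-shifted individual, and on the upswing, near the optimum, for the right-shifted one. Differences in `width` or `height` could be explored the same way.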

These questions should be a primary focus of
the field for years to come. It is indisputable that 
extremes of stress are bad for the brain, a fact 
pertinent to developmental neuroscience, neu- 
rogerontology, and everything in between. It is 
also equally indisputable that optimal amounts 
of stress enrich and sustain us. Hopefully, as 
knowledge continues to accumulate at the rate 
showcased in this issue, we will gain the means 
to spend more of our lives experiencing the left 
sides of our inverted-Us. 